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To the memory of Prof Giorgio Ponzano 


This book is based on the lectures of the Synthetic Biology course I taught at the Harbin 
Institute of Technology. The course was conceived for first-year Master’s degree students 
of biology and biotechnology. They had some wet-lab practice but not a strong background 
in mathematics and computer science. My goal was to present to them an engineering 
approach to biology, different from the traditional one they had learned as undergraduate 
students. 

This text is organized in theoretical and “hands-on” chapters. The theoretical chapters 
explain the main concepts in designing DNA circuits and the general ideas of modeling 
biological systems, with reminders to electronics and physics. The hands-on chapters 
describe how to use open-source software to design, analyze, and simulate genetic 
networks. In particular, each piece of software is applied to the two circuits that marked 
the start of the Synthetic Biology era: the so-called Repressilator and Toggle Switch. 

For the more formal, mathematical parts of the book, my main references were three 
excellent Systems Biology texts, namely: An introduction to System Biology by Uri Alon 
(Chapman & Hall/CRC, 2006), Systems Biology by Edda Klipp et al. (Wiley-Blackwell, 
2009), and System Modeling in Cellular Biology edited by Zoltan Szallasi et al. (MIT 
press, 2006). 

The software adopted in my lectures were COPASI, for genetic circuit modeling, 
general analysis, and simulations; XPPAUT, to carry out a detailed analysis of the 
equilibrium states of biological circuits; BioNetGen, to illustrate the basics of rule-based 
modeling — an intuitive way to generate formal representations of complex biological 
networks; and ProMoT — with the Parts and Pools add-on — for the visual, modular design 
of synthetic gene circuits. 

The chapter that concludes the book is dedicated to structural genetic motifs that 
perform specific functions (e.g., oscillations in protein concentrations, logic operations). 
They are accompanied by the description of synthetic bio-circuits where they have been 
employed. 


vii 


viii Preface 


I hope this book will give the readers the necessary knowledge to both understand 
papers in computational Synthetic Biology and start designing and simulating their own 
synthetic gene circuits. 


Harbin, China Mario Andrea Marchisio 
January 2018 
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What You Will Learn in This Chapter 

This introductory Chapter presents the concepts that lie at the basis of Synthetic Biology, 
a new branch of Life Sciences also referred to as Life Engineering. Synthetic Biology 
aims at modifying living cells in such a way that they can carry out precise tasks. Possible 
applications range from biofuel production to drug synthesis, from cancer cell detection to 
removal of environmental pollutants. The main artefacts of Synthetic Biology are DNA 
circuits where sequences of DNA interact by exchanging proteins or RNA molecules. 
Importantly, circuit implementation in the wet-lab should be model-driven i.e. genetic 
networks should be associated with mathematical models. Modeling permits to simulate 
and evaluate circuit performance in silico before their actual realization in vivo. 


1.1 What Is Synthetic Biology? 


Synthetic Biology is a young discipline whose birth can be dated to the year 2000 with the 
publication, in Nature, of the first two so-called synthetic gene circuits: the Repressilator 
[13] and the Toggle Switch [21]. 

The goal of Synthetic Biology is to turn biology into an engineering science i.e. to 
establish a methodology for the model-driven design and implementation of DNA circuits. 
Synthetic gene circuits provide the host cells with new features and functions, different 
from their natural ones. By dealing with circuits — though made of DNA, RNA, proteins, 
and chemicals — it was straightforward to take electrical engineering as a reference science. 
In particular, three main concepts were borrowed from electronics to draw the foundations 
of Synthetic Biology [15]. They are 
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Fig. 1.1 Concepts from electronics. (a) Abstraction hierarchy. Standard components (symbols, 
from top to bottom: resistors, batteries, capacitors, solenoids) are composed into devices (symbols: 
diode — top, transistor — bottom) that in their turn are used to build up circuits. (b) Definition of 
the Ampere (A), the unit of the electrical current, as the time derivative of the charge (Q) i.e. the 
number of electrons (orange circles) that go through a wire section (faded rectangle) per unit of time 
(second) 


¢ standard components, 
¢ abstraction hierarchy, 
* composability. 


In electrical engineering standard components are resistors, capacitors, solenoids, 
batteries, etc. They are assembled into devices, such as diodes and transistors, that perform 
a specific function. Devices are then put together into circuits that carry out complex 
operations. Electronic basic components and devices are connected to each other because 
they share a unique input/output signal i.e. the electrical current. This flux of electrons is 
quantified (in Ampere) as the electric charge (measured in Coulomb) that goes through a 
section of a wire in one second (Ampere = Coulomb/s — see Fig. 1.1). 

In Synthetic Biology, basic components for bacterial genetic circuits are DNA traits 
characterized by precise functions either in transcription or translation. According to the 
MIT Registry of Standard Biological Parts (http://parts.igem.org), they are divided into 
categories such as promoters, ribosome binding sites (RBS), coding sequences (CDS) for 
proteins and small RNAs, and terminators. Promoters are DNA sequences where RNA 
polymerases bind and start DNA transcription into mRNA; RBSs are the place, on the 
mRNA, where ribosomes bind and initiate protein synthesis; coding sequences correspond 
to genes or encode for small RNAs; terminators are signals, along the DNA, for RNA 
polymerases to stop transcription and leave the double chain. Standard Biological Parts 
can be assembled into biological devices. An instance of bio-device is a transcription unit 
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Fig. 1.2 Biological components and circuits. (a) Symbols of Standard Biological Parts according to 
the SBOL (Synthetic Biology Open Language) representation [20]. (b) Three bio-devices (D1, D2, 
and D3) corresponding to as many transcription units (colored boxes) are arranged into a gene circuit. 
D1 and D2 encodes for two different repressors (R1 and R2) able to bind and repress D3 promoter 
(red lines ending with a bar). D3 produces the circuit output, a fluorescent protein (a line ending with 
a circle symbolizes translation). D3 alone gives the maximal fluorescence level. When D1 and/or D2 
are added to the circuit, reporter protein expression is down-regulated with a consequent decrease in 
the circuit fluorescence level 


that produces a reporter protein. This construct is realized by merging a promoter with 
an RBS, the fluorescent protein gene, and a terminator. If the promoter is regulated by 
transcription factors, this device can be used as a component of a bio-circuit where other 
transcription units synthesize the promoter regulatory proteins (see Fig. 1.2). 

What makes Standard Biological Parts composable is the flux of two different 
molecules i.e. RNA polymerase and ribosome that are the main actors in transcription and 
translation, respectively. RNA polymerases scan all the biological parts along a transcrip- 
tion unit, whereas ribosomes goes — along the mRNA - through the parts that have been 
transcribed. RNA polymerases and ribosomes are referred to as common signal carriers 
[15] and represent biological counterparts of the electrons. Their fluxes can be considered 
as bio-currents and are referred to as PoPS (Polymerases Per Second) and RiPS (Ribo- 
somes Per Second — see Fig. 1.3). In analogy to electronics, PoPS (RiPS) is defined as the 
number of RNA polymerases (ribosomes) that cross a section of DNA (mRNA) per second. 

In electronics, circuit implementation is preceded by computational design, analysis, 
and simulations. Indeed, the dynamics of electrical circuits can be faithfully predicted by 
applying the physical laws of electromagnetism. In order to affirm Synthetic Biology as 
a proper engineering science, gene parts and devices should be accompanied by rigorous 
mathematical models too. This would allow a reliable prediction of the dynamics of com- 
plex gene networks and drive their wet-lab implementation. Up to now, however, nothing 
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Fig. 1.3. Bio-currents. RNA polymerases (RNAp) molecules enter a transcription unit at the 
promoter and flow, as PoPS, through all the DNA parts until they meet a terminator and leave the 
DNA. Ribosomes bind the mRNA at the RBS and move, as RiPS, into a coding sequence. They 
leave the mRNA after recognizing a STOP codon 


comparable to the electromagnetism laws exists in biology. Several modeling techniques, 
which are presented in this book, have been proposed and proved to be able to reproduce, 
under precise assumptions, experimental data. Nevertheless, a theoretical/computational 
framework that provides accurate predictions about the output of synthetic gene circuits 
and help their practical implementation is still to come and its establishment represents a 
major challenge in Synthetic Biology. 


1.2 Parts, Pools, and Visual Gene Circuit Design 


A more detailed gene circuit representation regards three more molecules as common 
signal carriers because of the role they play in transcription and translation regulation 
[33]. They are transcription factor proteins, small antisense RNAs (sRNAs), and chem- 
icals. Transcription factors bind promoters by either competing with RNA polymerases 
(repressors) or recruiting RNA polymerases to the DNA (activators); small RNAs bind 
complementary sequences on the mRNA and either form (more frequently) or remove 
hairpin structures that hinder the ribosome flux along the mRNA chain; chemicals 
(also referred to as environmental signals) carry out a control both on transcription and 
translation. The former is achieved by binding and activating or de-activating transcription 
factor proteins, the latter by docking at mRNA structures such as riboswitches and 
ribozymes. Each new signal carrier corresponds to a new bio-current. They are termed 
FaPS (Factors Per Second), RNAPS (RNAs Per Second), and SiPS (Signals Per Second), 
respectively. 

In this enlarged picture all five signal carriers are associated with new entities called 
Pools. They represent the place, in the cell, where free molecules of signal carriers are 
stored. Therefore, the contents of pools are biological potentials on which circuit dynamics 
depends. To make an analogy with electronics, pools can be seen as bio-batteries. 
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Fig. 1.4 Gene circuit design with parts and pools. (a) Symbols for pools. (b) Representation of the 
circuit in Fig. 1.2 with the insertion of RNA polymerase, ribosome, and transcription factor pools. 
RNA polymerase and ribosome pools are connected to each transcription unit (blue and green lines, 
respectively). Repressor R1 pool is the interface between D1 and D3, whereas R2 pool puts in 
communication D2 and D3 


Graphically, pools are interfaces either among transcription units (see Fig. 1.4) or, in 
the case of chemicals, between the whole circuit and the surrounding environment. 

Parts, pools, and fluxes of signal carriers permit the modular modeling and compu- 
tational design of synthetic gene circuits. As we will see in Chap.9, with the software 
Parts & Pools [32] — an add-on of ProMoT (Process Modeling Tool [39]) — gene circuits 
are realized in a “drag & drop” way by, first, displaying parts and pools on the computer 
screen and, then, connecting them via hypothetical wires through which signal carriers 
are supposed to flow. This computational, visual design style was inspired by software for 
electronics (see, for instance, P-Spice [41]). 


Take Home Message 

¢ Synthetic Biology is an engineering science that combines mathematical modeling, 
computer simulations, and wet-lab implementation of DNA circuits. 

¢ Synthetic gene circuits are organized in transcription units that interact by exchanging 
proteins and small RNAs. 

¢ In bacterial cells a transcription unit, which leads to protein production, is made of 
four basic DNA components: promoter, ribosome binding site, coding sequence, and 
terminator. 

* Concepts such as standard biological parts, pools, and fluxes of signal carriers facilitate 
the modular modeling and computational design of synthetic gene circuits. 


Check for 
updates 


What You Will Learn in This Chapter 

The concept of mathematical model is generally new to students of Life Sciences. 
Biological systems are usually described with cartoon schemes that only illustrate the 
interactions among different species. A model — as it is conceived in physics — is a 
mathematical representation that permits to predict the dynamics of a system (i.e. how 
species concentrations change over a given time interval) under precise assumptions and 
after defining the system initial conditions. The notion of predictive and quantitative model 
is here clarified with an example from classical physics. 

In Synthetic Biology, the definition of a model demands to delineate the species and 
the bio-chemical reactions that are essential for a proper estimation of the working of 
a genetic circuit. The way species interact is determined by a kinetics. Mass action 
kinetics is adopted to derive a mechanistic model that aims at a thorough reconstruction 
of the interactions that take place inside a circuit. A phenomenological model, in contrast, 
tries to reproduce patterns from experimental data. Kinetics based on Hill functions are, 
in this case, generally employed. In genetic networks, Hill functions are particularly 
useful to describe the activation and repression of fundamental mechanisms such as DNA 
transcription. 
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8 2 Modeling: Choosing a Kinetics 


The main goal of this Chapter is to learn how to build a mathematical model for simple 
synthetic constructs. In particular, you will learn how a set of species and reactions, 
once associated with a kinetics, can be translated into a system of ordinary differential 
equations, whose solution gives the system dynamics. 


2.1 What Is a Model? 


Traditionally, biological systems are associated with cartoon diagrams that provide a 
graphical representation of the interactions among different species. Cartoon schemes 
are useful to get an intuitive understanding of a system but they cannot be considered 
as models since they do not give any quantitative description. Furthermore, a model must 
be predictive. This means that a model should be able not only to explain a set of observed 
data but also to foresee — within a reasonable precision — the results you will obtain by 
repeating, under different initial conditions, the experiment the data comes from. However, 
you have to take care that the assumptions for which the model was proved to be valid are 
respected during a new experiment. 

Let us forget about biology for an instant and think of a simple physics experiment. It 
is well-known that the motion of a falling body is faithfully described by the mathematical 
equation s = 5 gt? if the friction of the air is negligible (in the formula, s stands for space, 
t for time, and g = 9.81 m/s? is the gravity acceleration). If you want to check the validity 
of this model, you can think to first calculate and then measure the time a solid body (a 
stone perhaps) takes to reach the ground after falling from different known heights (initial 
conditions). If, for instance, you let the stone fall from 5, 10 and 15m, you will measure 
something very close to 1.01, 1.43, and 1.75 s, respectively. However, if you decide to let 
a paper sheet fall instead of a stone, you will see that the paper sheet will take much longer 
to reach the soil than the just calculated times. Shall you then conclude that the model is 
wrong? No, the model works properly (i.e. it is predictive) as long as the assumption of no 
friction between the falling body and the air is verified. This is valid for a stone but not for 
a piece of paper. 

Finding a predictive mathematical representation for a synthetic gene circuit is tricky 
because the number of species and reactions to deal with is generally high, even in 
the presence of just a handful of transcription units. Moreover, kinetics parameters (the 
counterpart of g in the example above) are often too numerous and hardly measurable. 
In Systems Biology, however, different techniques have been developed to analyze the 
structure and simulate the temporal behavior of biological systems. We will make use of 
them to build up models for synthetic gene networks. 
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2.2 Mass Action Kinetics 


Let us consider, as a simple synthetic construct, a transcription unit whose products are 
proteins (monomers) that, once synthesized, dimerize. Our goal is to write a model for this 
synthetic device and, according to it, estimate the species concentrations! at steady state* 
without recurring to any piece of software for model simulations (as, in contrast, we will 
do in Chap. 3). 

First, we have to define the species to be included in our model. A transcription unit is 
made of four different parts (promoter, RBS, CDS, and terminator). RNA polymerases 
transcribe the RBS and the coding sequence into mRNA that is, then, translated into 
monomers by the ribosomes. Pairs of monomers associate and form dimers. Moreover, 
promoters and mRNAs lay, at least, into two different states: inactive and active (i.e. 
unbound by and bound to RNA polymerases and ribosomes, respectively). 

If we do not have enough data regarding each genetic part, we can simplify the system 
design and use the sole promoter as a representation of a whole transcription unit. Another 
approximation commonly used is to treat translation as a single step event by neglecting 
DNA transcription. This implies that mRNA and ribosomes will not appear in the model. 
Finally, since RNA polymerase amount is much bigger than that of the promoter (the 
latter can be present inside the cell in just a single copy), we can suppose that RNA 
polymerase concentration will not change considerably due to the interactions with the 
synthetic transcription unit. Thus, its explicit knowledge is not necessary for accurate 
model predictions.* As a result, we can omit RNA polymerase from the model. Overall, 
we need only four species in our model: inactive promoter (P), active promoter (P*), 
monomers (M), and dimers (D). A cartoon representation of this system is given in 
Fig. 2.1. 

Once we have defined the species in the system, we have to determine how they 
interact. In order to make use of mass action kinetics, we shall identify the bio-chemical 
reactions that take place in the system and associate each of them with a rate constant i.e. 
a parameter that is proportional to the frequency at which the reaction occurs.* 

We can suppose that the promoter switches between the active and inactive state (this 
takes into account implicitly the binding and unbinding of RNA polymerase); active 


‘Concentrations are measured in molars, M. The concentration of one Molar corresponds to a mole 
of a substance in a volume of one litre (J). Hence, M = moles/1. A mole is defined as the quantity 
of a substance that contains the Avogadro’s number (Nay = 6.022 1023) of molecules. 


Generally, when a system reaches the steady state — or equilibrium — species production and 
degradation are balanced such that species concentrations no longer change over time. 
3 At the end of this section we will see under which condition this assumption is actually correct. 


(k1,k-1) 
4Ina generic reversible reactionaA+bB = cC+dD, A and B are the reactants, C and D 


the products, whereas a, b, c, and d represent the stoichiometric coefficients. k; and k_, are the rate 
constants associated with the forward and the reverse reaction, respectively. 
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RNA polymeras 


qh Monomers Dimer 
a 


© translation 


Promoter (P/P*) 


Fig. 2.1 Cartoon representation of our system. The presence of RNA polymerases is not taken 
into account explicitly. However, its action is not neglected since the promoter can switch between 
two states: active (P*) i.e. bound to RNA polymerases and able to start monomer synthesis, and 
inactive (P) i.e. unable to produce any protein. In this framework translation is a single-step event 
(transcription is neglected) 


promoters produce monomers and go back to their inactive state; monomers dimerize 
(reversible reaction); dimers and monomers are finally degraded. Overall, we have seven 
different reactions 


(kik) 
* ky 
p* 5 Pim 
(5,€) 
2M = D 
k 
M dM 
kab 
DBS, (2.1) 


According to the law of mass action, the rate of a bio-chemical reaction (i.e. the change 
in the concentration of the reactants per unit of time or, in other words, the time derivative 
of the reactant concentrations) is given by the product of the reactant concentrations to the 
power of their stoichiometry times the reaction rate constant. This quantity accounts for 
the probability of a collision among the reactants under the assumption that the system is 
well-stirred i.e. spatially homogeneous. 

How does, for instance, the dimer concentration, D,. change over an infinitesimal time 
interval dt according to mass action kinetics? Dimers are involved in three reactions whose 
kinetic parameters are 6, €, and kgp. The rate (also called speed or flux and labelled with 


5 More properly, one should write [D](t) to indicate a species concentration as a function of time. 
However, for the sake of simplicity, we will mainly use, throughout this book, only the species names 
to refer to species concentrations. Moreover, the time dependence will be also generally omitted from 
our notation. 
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the Greek letter v) of the dimerization reaction is given by v’ = 5M?. The speed at which 
dimers turn back to the monomer configuration corresponds to v* = € D, whereas the rate 
of dimer decay is given by v¢? = kapD. If we subtract v* and v4? to v°, we will obtain 
the time derivative of the dimer concentration 

dD 

Ti: v® — vp — vkaD = §M* — (€ + kap)D. (2.2) 

Equation (2.2) is an ordinary differential equation (ODE).° A different ODE is 

associated with each of the other three species involved in Eqs. (2.1) such that the system 
dynamics (i.e. the values the species concentrations assume over a given time interval) is 
computed by solving a system of four ODEs. In a compact form we can write that 


dS 
—=Np), (2.3) 

dt 
where S is the species concentration vector and v is the flux vector. They are both column 
vectors whose transposes correspond to 


Ss’ = (P, P*, M, D) 
v’ = (ki P,k_1 P*, ko P*,5M*, €D, kamM, kapD). (2.4) 
In Eq. (2.3) N represents the stoichiometric matrix, whose generic element Nj; is defined 


as the stoichiometry of the species i in the reaction j times —1 if the species is a reactant 
(and, of course, | if the species is a product). In our example, the stoichiometric matrix is 


-! 1 10 0 0 0 
N= 1-1-1 0 0 0 0 (2.5) 
0 0 1-2 2-1 0 


0 0 0 1-1 O-!1 


Equation (2.3) can be expanded as 


= —ky P + (kK_1 + kp) P* 


d 
dP* 
= kj P — (k-1 + ky) P* 

Th 1 (kK_-1 +k) 

dM 

ap = ky P* — 2(8M* —€D) — kamM 

dD ; 

aT dM? — (€+kap)D. (2.6) 


SAll quantities involved in an ODE depend on a single independent variable, time in our case. 
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In general, ODE systems are solved numerically. As we will see in Chap. 3, software 
such as COPASI (Complex Pathway Simulator) [24] permit to visualize the concentration 
values of our four species at different time points. However, here we want to estimate 
species concentrations at steady state without the help of any piece of software. Steady 
state means no change in species concentrations, which corresponds to 


dS 
apa! (2.7) 
Therefore, Eqs. (2.6) become 
0 = —kiP + (k_1 + ky) P* 
0 = ky P — (k_1 + ky) P* 
0 = ky P* — 2(5M* —€D) —kayM 
0 = 5M* —(€+kgp)D. (2.8) 


Before solving the algebraic system in Eqs. (2.8), we can try to simplify it by looking 
for conserved quantities in the system. Their amount (Q) is given by the relation 


Q=n-rank(N) =4-3=1 (2.9) 


where n is the number of species and rank(N) corresponds to the number of rows (or 
columns) of N that are linearly independent. ' Hence, one quantity is conserved. This 
implies that one equation in Eqs. (2.8) can be replaced by a conservation relation. Notice 
that N has indeed only three rows that are linearly independent since the first row is equal to 
second one multiplied by —1. Looking at our bio-device, it is clear that the total promoter 
concentration Pr = P + P* is conserved during an experiment (or a simulation). This 
is consistent with the fact that the rows of N corresponding to P and P* are linearly 
dependent. 

We can replace the first equation in Eqs. (2.8) with the rearranged conservation relation 
P = Pr — P*. Since we are calculating concentrations at steady state, from now on we 
will add the suffix “ss” to each species in our system. Equations (2.8) become 
Pss = Pr — P. < 


0 = —ky Pss + (k-1 + kn) PS 
0 = ky PX — 2(5M;, — €Dss) — kam Mss 
0 = 6M2. — (€ + kap)Dss. (2.10) 


7Given three vectors v, w and z, they are linearly dependent if, for instance, z can be written as 
z= av-+ bw, wherea,be R. 
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After some algebraic steps, Eqs.(2.10) give the following expressions for species 
concentrations at steady state 


Pr 
Py = k 
—1tke 

1+ i; 
Pss = Pr = Pe 

€+kap 
Mi, = <4 p, 

1 * 

Dss = ap 62s —_ kam Mss). (2.11) 


P* depends only on three parameter values and a constant concentration (Pr). 
Therefore, P;« can be treated as a constant as well in Eqs. (2.11), let us call it w. Pss 
is calculated by subtracting uw to Pr and its value has no effect whatsoever on Ms, and 
Dss. This implies that Eqs. (2.11) can be reduced to a system (of the second order) of two 
algebraic equations in the variables M,, and D,,. If we define 


ky 
ay = —— 
2kap 
kam 
a. = —— 
2kap 
€+kap._ 
p=(— (2.12) 


the system of algebraic equation we have to solve can be written in a compact form as 


BM:, +a2Ms; — a4 = 0 
Dss — ah + a2Ms, = 0. (2.13) 


With the choice of parameters values as is Table 2.1, we can easily calculate that, at 
steady state, our system is made of 193 dimers, 44 monomers, and the only promoter in 
the system is almost completely active (P*, > 0.99 Pr).® Finally, it should be noted that 
we have to assume that the cell volume is equal to 1.66 10~!> 1 (a reasonable value for E. 
coli cells) in order to assign to a single molecule the concentration of 10~? M, as it is the 
case of our promoter.” 


8The constants in Eqs. (2.12) take the following values: a; =< 215.52, a2 = 0.5, B © 10° M, and 
pe 1079 M. 


°To 1 molecule correspond 1/Nay moles 1.e. mins 10-23 = 0.166 10-23 moles (let us call this 


quantity N,,). If we require that a single molecule is associated with a concentration C = 10-9 M, 
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Table 2.1 Parameter values 


Parameter | Value Unit 
(rate constants and a a) 
concentration) for the 7 10 M 
biological system associated ky 10° a 
with Eqs. (2.11). (They are k_| 0.01 s_! 
taken from [33]) ko 0.5 gv! 

5 10° M7!s7! 

€ 10 sl 

kam 0.00116 |s—! 

kap 0.00116 | s—! 


2.2.1 Adding a Species to the Model: RNA Polymerase 


If we want to consider RNA polymerase (RN Ap) explicitly in our model, we have to 
change a couple of reactions in Eqs. (2.1), namely 


(kik-1) 
RNAp+P = P 


p* ©, p4M+RNAp. (2.14) 


For the sake of simplicity, we can neglect the reactions of RNA polymerase generation and 
degradation. RN Ap dynamics is given by the following ordinary differential equation 


dRNAp 


a TERN AD P + (ki + ky) P* (2.15) 


that, non surprisingly, coincides with the ODE associated with P. Since RNA polymerase 
total concentration is conserved, it holds that 


RNApr = RNAp + P*. (2.16) 


Following the same procedure as above, P;, becomes 


ss k_yj+ko 1 ’ (2.17) 


a ky " RNADss 


whereas for M;; and Ds; hold the same expressions as in Eqs. (2.13). In Eq. (2.17), 
RNAps;s represents the molecules of RNA polymerase that are not bound to the promoter 


the volume V;¢7; (in litres) containing this single molecule has to be set to: Vee7,? = Nm/C = 
0.166 10-73 . 10? = 1.66107 91. 
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at steady state. Let us set y = ae (with the parameter values in Table 2.1, 


y= 5.1 10-° M). Equation (2.17) is rewritten as 
Pr 


* 
SS 


Ss , (2.18) 
Lf RN Apa 


We can study how P* varies with respect to RN Apss. Equation (2.17) coincides with 
the expression of P* in Eqs. (2.11) when RN Apss is equal to 1 molar — a huge quantity. 
For an infinite amount of RN Ap, we would have that P* = Pr at steady state. This is 
not very different from the result we got in our previous analysis i.e. P=, > 0.99 Pr. 
If RNApss = y, one half of the promoters in the system is active at steady state 
(PZ = 5Pr). Therefore, we can make qualitative predictions on the system at equilibrium 
by taking y as a reference concentration. A reasonable value for RN Apr in E. coli is 
2.110-°M [31,33]. Since, in our system, this concentration is much higher than Pr, we 
can consider that RNApss ~* RN Apr. This value — represented with a star (*) in Fig. 2.2 
— is lower than y. Therefore, we expect that P* is smaller than Pr/2, which implies 


a reduced amount of monomers and dimers at steady state. By solving Eq. (2.13) with 


w= wo: our qualitative predictions are confirmed because the number of monomers 


RNApr 
and dimers at the equilibrium is decreased to about 23 and 51 molecules, respectively. 


Hence, RNA polymerase amount has no influence on model prediction only when the 
condition RN Apr > y is satisfied. 


It should be noted that, under the assumption that RNAps; ~ RN Apr, we can define 
another parameter, for instance y = ate , and study how P* changes by varying the 


RNA polymerase-promoter binding rate constant (k1) according to the formula 


Pt = a (2.19) 


Fig. 2.2 Steady state 
concentration of active 
promoter (P*,) as a function of 
free RNA polymerase 
concentration at the 
equilibrium, RN Apss 


* Y 4M RNAp,g. (M) 
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With our choice of parameter values, y = 2.410°M~!s~!. Not surprisingly we find again 
that P= < Pr/2 since ky < y. By increasing k;, we would have a higher concentration 


of active promoter at steady state and, therefore, a higher amount of proteins as well. 


2.3 Hill Functions 


Hill-function-based kinetics is a de facto standard for the characterisation of regulated 
promoters because it mainly demands the knowledge of two parameters (the so-called Hill 
coefficient m and Hill constant K 7) that can be faithfully estimated in wet-lab experiments. 
In this representation, RNA polymerase is omitted and a promoter is bound only by 
activator and/or repressor proteins. 


2.3.1 Activated Promoters 


Let us consider that n molecules of the activator A bind an inactive promoter P and activate 
it (P*). Active promoters lead the production of a reporter protein that we will refer to as 
F (see Fig. 2.3). The production rate of F is proportional to the concentration of active 
promoters in the system. As mentioned above, RNA polymerase is neglected. Moreover, 
translation is considered as a single-step event. 

Under the hypothesis that A concentration is constant, the system reactions are 


(kk) 
nA+P = P* 


k 
pe ap Peek 


kik 
kor 
k 
Fs (2.20) 

Fig. 2.3. Promoter P is 
activated (P*) and produces 
the reporter protein F only 
when bound by n molecules of Activators (A) 
the activator A 


| Reporter (F) 


a 


(2 © translation 


Promoter (P/P*) 
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The only reversible reaction in Eqs. (2.20) follows from the fact that a promoter 
changes state only upon binding/unbinding the activator proteins. The third reaction in 
Egs. (2.20) accounts for F production due to promoter leakage provoked by the binding 
of RNA polymerase (normally at low frequency) to inactive promoters P.!° Promoter 
concentration is conserved in this system. Therefore, it holds that Pr = P + P*. 

The reversible reaction in Eqs. (2.20) allows us to write the ordinary differential 
equation that describes the dynamics of active promoters 


dP* 
dt 


=k A"P —kP*. (2.21) 


With the steady state condition and the conservation of promoter concentration, 
Eq. (2.21) becomes 


0 =k, A" Pr — (kj A" + k_1)P* (2.22) 
from which we derive that 


MP ppc, ye (e)” 
Ki, +A" 1+(4y 


* 


kj A" +k_1 


Pr. (2.23) 


The fraction in Eq. (2.23) is a Hill function, where K y : Kiy = k_,/k, is the Hill constant 
and n the Hill coefficient. The former gives the concentration of A necessary to activate one 
half of the promoters in the system; the latter quantifies the cooperativity effects among 
activator proteins binding a promoter. As we will see later on in more details, n gives the 
slope of the curve representing P* (or F) concentration with respect to the concentration 
of A. 

From the reactions in Eqs. (2.20), we have that F dynamics obeys to the following ODE 


dF . 
Gps hik + kaP* — ha F. (2.24) 


If we use in Eq. (2.24) the approximation of P* from Eq. (2.23), we obtain that 


dF (xe)" 


=k ——1+_1— Pr — kgF. (2.25) 
ai "T+ 


: : kik 
!0This reaction can also be represented as P —> P+ F. 
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In general, ko Pr is replaced with a different rate constant, k;, whose units are M/s. 
Hence, Eq. (2.25) becomes 


dF (x)" 
— = hig + ktr —4— — ka F (2.26) 
dt 1+ (a) 


2.3.2 The Hill Coefficient x 


The Hill coefficient n is also known as cooperativity index. When n > 1, the binding of 
transcription factor molecules to their target promoter is said to be cooperative. The first 
reaction in Eqs. (2.20) assumes that n molecules of A bind together to P. This is called 
complete cooperativity. However, complete cooperativity is unlikely to happen inside the 
cell. A mechanistic representation of the activator-promoter interactions would demand to 
consider n separate reversible reactions characterized by different binding/unbinding rate 
constants 


(ky1,k-11) 
A+P = P! 
Fi 1 (k12,k—12) 5) 


A Es pri (kKin,k-1n) x 
(2.27) 
where P!,..., P”—! are all “intermediate” inactive promoter states. Moreover, kin, > 
Kin-1 eo. Seiya ki2 > ky (and k_1n < K_tn-1 SS veg S&S k_12 < k_11) i.e. one 


molecule of A bound to the promoter favours the binding of another activator molecule. 
The assumption of complete cooperativity simplifies drastically the scenario depicted 
in Eqs. (2.27). However, experimental values of n are, in general, real numbers and not 
integer as a completely cooperative behavior among transcription factor molecules would 
imply. 


2.3.3 Repressed Promoters 


A system symmetric to the one in Fig. 2.3 is made of a promoter that is active (P*) in 
its ground state and switches to the inactive configuration (P) after binding a repressor 
protein (R) whose concentration is supposed to be constant. When active, the promoter 
leads the synthesis of protein F (see Fig. 2.4). 
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Fig. 2.4 Active promoter P* 
produces the reporter protein F 
in the absence of molecules of 


repressor R Repressors (R) 
a Reporter (F) 


a 


[”  transtation 


Promoter (P*/P) 


We assume complete cooperativity among the repressor molecules such that the system 
reactions are 


. kk) 
nR+P — 
a oe 
k 
F== . (2.28) 


We can consider that the total promoter concentration is a conserved quantity (Pr = 
P + P*). Therefore, P* dynamics is given by the solution of the following ordinary 
differential equation 


dP* 


ai = —kjR"P* +k_,P = —(k,R" + k_1)P* + k_1 Pr. (2.29) 


By setting Eq. (2.29) to zero (steady-state condition), we can write P* concentration as 


k-4 _ I 
kyR"+k4 1+ 2. R" 1+(e)" 


* 


Pr, (2.30) 


where the Hill constant Kj, satisfies, as above, the relation Kj, = k_1/k1. 


The fraction in Eq. (2.30) is the Hill function for repressed promoters. Protein F 
dynamics is associated with the following ODE 


dF 
— = ky P* — kg F = ky 


—— he, (2.31) 
dt 1+ ()" 
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where we set k;, = kz Pr. Notice that, differently from the activated promoter, the leakage- 
dependent production of F is not required in this case since the Hill function is zero only 
for an infinite amount of R.!! 


2.3.4 Modeling the Action of Chemicals on Transcription Factors 


Chemicals are inputs for many synthetic gene circuits. They play a role in regulating both 
transcription and translation by binding either transcription factors or mRNA structures 
such as riboswitches and ribozymes. Following the notation in [31], chemicals are divided 
into two classes depending on their effect on transcription or translation, namely 


¢ inducers (7), i.e. chemicals that activate transcription or translation, 
* corepressors (c), i.e. chemicals that inhibit transcription or translation. 


As for transcription regulation, chemicals bind transcription factor proteins and change 
their wild-type spatial configuration. Inducers bind either active repressors (R“), making 
them unable to bind and repress their target promoters, or inactive activator (A’), turning 
them into an active configuration that has access to the DNA and recruits RNA polymerase 
molecules to initiate transcription. Analogously, corepressors bind inactive repressors (R’) 
and active activators (A®). 

Let us suppose that the repressor in Fig. 2.4 is active in its ground-state configuration 
(R“) and can be bound and deactivated by the inducer, 7. The new system is illustrated in 
Fig. 2.5. We require that 2 molecules of i bind a single repressor protein and nz molecules 
of R® bind the active promoter P*. In both cases, we assume complete cooperativity. 


Fig. 2.5 The active promoter me 


P* produces the reporter 
protein F only in the presence . oo fg 
of molecules of the inducer i 

that bind and inactivate R@ oC (R4) 


proteins a , Reponten | f 


Pranaiation 


PL (P*/P) 


‘1h better fit to experimental data might, however, require an extra reaction for the synthesis of F 
due to promoter leakage. 
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If we neglect both R“ and i production and degradation (i.e. their total concentra- 
tions are constant), the system in Fig.2.5 can be modeled according to the following 
reactions 


nitR®? = R 
n2R*% + P* lie 
p* ©, p*.F 
ee (2.32) 


where 4 and yw are the rate constants of the inducer association with and dissociation 
from repressor molecules, respectively. Inactive repressor R' dynamics depends on the 
first reversible reaction in Eqs. (2.32) that leads to the following ODE 


dRi 
dt 


= AR"! — wR’. (2.33) 


If we use the steady-state condition and call as R¢ = R® + R’ the total amount of free 
repressor at the equilibrium, we obtain from Eq. (2.33) that the concentration of R“ can be 
approximated, with a Hill function, as 


i Ee 
= J = , 
ae Ae Ga" 


(2.34) 


ny 


where, for the Hill constant ky, it holds that ky, = w/d. 
From the analysis of our previous system (in Fig. 2.4) we know that, at steady state, 
active promoter P* concentration has the form of a Hill function too 


P. 

————ee (2.35) 
1+ a 
where Ke = k_\/k,. If we substitute R° into Eq.(2.35) with the formulation in 
Eq. (2.34), we express P* as a function of the concentration of the system input i 

P. 

pre a (2.36) 
Ry 


Kip + my 
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By using the result of Eq. (2.36) into Eq. (2.31) above, we get the ODE corresponding 
to the dynamics of F 


ce k : kqaF (2.37) 
dt — Str RP d ? . 
1+ Sa Fl 
Ky (+ ip) 2 
where, as usual, k;, = k2 Pr. Under steady-state approximation, F’ depends explicitly on 


i and no longer on R“. Equation (2.37) is complicate and this formalism is used seldom. 
The result that F' does not explicitly depend on R“ allows us to reformulate F dynamics 
as a function of the sole inducer ij, the circuit input signal. F dynamics obeys an ODE as 
in Eq. (2.26) where, however, the activator protein is replaced by the inducer chemical 


ha? (2.38) 


Differently from Eq. (2.37), in Eq. (2.38) a term due to leakage production (i = 0) has to 
be taken into account. 

Equation (2.38) can be generalized to describe both transcription activation and 
repression by an input signal s as 


S \n-a 


dF 
—— = otkix + kip —— — ka F , (2.39) 
dt 1+(e)" 


where a = | in case of activation (s = i) and a = 0 in case of repression (s = c). It 
should be stressed that Eqs. (2.38) and (2.39) are valid only if Rf —and hence R®@ — has a 
constant concentration at steady state. 


2.3.5 Approximations on Hill Constant and Coefficient 


In the example above the Hill constant ky, which is expressed in M, gives the concen- 
tration of the input signals (chemicals) necessary to inactivate half of the repressors R@ 
present in the circuit. The formulation of ky we obtained was, however, strictly model- 


og 
dependent. Indeed, by considering (as it should be done) R’ degradation reaction (R’ 4) 
Eq. (2.33) becomes 


dR! 
dt 


= ARI"! — (wt kai)R' , (2.40) 
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which implies a new formulation of ky, namely ky = thas Moreover, inducers can bind 


R® molecules also when they are bound to the promoter. This corresponds to a reaction 
neglected in Eqs. (2.32), namely ni + P > p* 4+ Ri (where, for the sake of simplicity, 
one could set y = 4). This reaction modifies R! dynamics (therefore, the value of k7) and 
contains the Hill cooperativity coefficient as well. Taken together, experimental values of 
both Hill coefficient and constant lump the effects of a multitude of reactions whose rate 
constants are difficult to estimate. This practical limitation in measuring kinetic parameter 
values is one of the major obstacles in the development of reliable fully mechanistic 
models. 


2.3.6 Estimating Hill Constant and Coefficient from Experimental Data 


Let us consider a system where an inactive repressor binds a promoter and suppress the 
synthesis of the system output, the reporter protein F’, only in the presence of an input 
signal S (a corepressor). Let us also suppose that, in our lab, we measure the concentration 
of F for different concentrations of the chemical S and trace a so-called “dose-response” 
curve as in Fig. 2.6. 

We shall grow several cultures of our synthetic cells, treat them with different 
concentrations of S, and finally measure F’ concentration in each cell culture once they 
have reached the steady state. In the case of a repressed promoter, we have 


1 
0 = ky ——\—_, _— ka F. (2.41) 


1+ (4) 


Fig. 2.6 Dose response curve. F (M)+ 
The concentration of the 

reporter protein F’,, the circuit 

output, is plotted with respect F 
to the concentration of the ° 
circuit input, the corepressor 
chemical S. A typical sigmoid : 
behavior, due to cooperativity ° 
effects, is here shown. The 
maximum value of F (Fingx) 
corresponds to the absence of : 
chemicals in the system, 
whereas basal expression of F 
(Finin) is due to a high 
concentration of S$ 


min.. 
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After setting 6 = ae, Eq. (2.41) is rearranged as 


FL 1 
PGR) 
B sy’ 
i) 

= Ff Ss n 


Equation (2.42) can be linearized by calculating the logarithm of its two members 


log (=) = nlog(S) —nlog Ky. (2.43) 


In a plane where y = log (5*) and x = log(S), the function in Eq. (2.43) is a line, 
with slope n, that crosses the y axis at —n log( Ky), as depicted in Fig. 2.7. 

Values for n and Ky can be determined by fitting the experimental data gathered in 
the lab (we will describe this procedure in Chaps. 10 and 11) to the model in Eq. (2.43). 
However, within this framework there are still two issues. F is supposed to be a protein 
concentration whose quantification might be tricky. Furthermore, one should know the 
value of 6. If, on one hand, kg can be determined rather precisely, on the other hand k;, is 
difficult to measure. 


Fig. 2.7 Graphical log[(B-F)/F] + 
representation of Eq. (2.43) 


log(K) 


log(S) 
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To circumvent these hurdles, wet-lab data are normally fitted against slightly different 
functions that are referred to as empirical Hill functions [40]. Moreover, a common read- 
out for synthetic circuits is fluorescence rather than protein concentrations. 

If we call Foy; the (general) read-out of our repressed promoter and indicate with Finin 
the output due to the promoter leakage only (high chemical concentration in Fig. 2.6) and 
Finax the output in the presence of full promoter activity (S$ = 0 in Fig. 2.6), the empirical 
Hill function will look like 


1 
I+ (5)° 


In case of a repressed promoter, K 7 is also referred to as C50 (half maximal inhibitory 
concentration). As for activated promoters, K y is, in contrast, termed EC50 (half maximal 
effective concentration) and the empirical Hill function is written as 


Fout = Finin + (Finax Finin) (2.44) 


1 
a: 
ta (22) 


Fout = Finin + (Finax Fin) (2.45) 


Overall, the parameter n and Ky are estimated in order to reproduce, within a good 
accuracy, the sigmoidal behavior of F as a function of S, characteristic of a dose- 
response curve. This explains why models based on Hill functions are referred to as 
phenomenological models. 


2.4 Michaelis-Menten Kinetics (Enzymatic Reactions) 


Michaelis-Menten kinetics is generally used to represent enzymatic reactions.!* In its 
simplest form, the Michaelis-Menten interaction scheme corresponds to 


(k1,k-1) ko 
+S = [ES] — E+P (2.46) 


where EF stands for the enzyme, S for the substrate, P is the product of the enzymatic 
reaction, and [E'S] represents the complex made by the enzyme bound to the substrate. 
According to Eq. (2.46), the enzyme EF converts molecules of the substrate S$ into 


!2As we will see in Chap.9, a Michaelis-Menten scheme is adopted in the Parts & Pools 
framework [32] to describe the interactions between RNA polymerases and promoters, which lead 
to transcription initiation, and the reactions between ribosomes and the mRNA, which determine 
translation initiation. 
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molecules of the product P. The rate of this reaction is defined as the quantity of P 
generated per unit of time, namely 


a 2.47 
=a (2.47) 
Calculation of v via mass action kinetics would require to solve (only computationally) a 
system of four ODEs. However, when the binding reaction of the enzyme to the substrate 
(rate constant: k,) is much faster than both the dissociation of the enzyme from substrate 
(k_1) and the conversion of the complex [ES] into the product P with the release of free 
enzyme molecules (kz), [ES] changes so slowly over time that it can be treated as if it 
were at the equilibrium.!* This is the so-called quasi-steady-state condition (this concept 
will be further explained in Chap. 10) and leads to write that 


d{ES] 
0= —— =hE-S~ (ki +kESI. (2.48) 


If we assume that the total enzyme concentration stays constant over time (Ey = E + 
[E'S]), from Eq. (2.48) we obtain that 


k,Er-S Er-S 
[Es] = ——* = 7? (2.49) 
kj t+kh+k,S S+Ky 
where Ky = + is the Michaelis-Menten constant. Since, from Eqs. (2.46), we have 
that 
a ko[ ES] (2.50) 
dt —— 2 ? * 
the reaction speed is given by 
S S 
= k,Er —— = ———_., 2.51 
v 2 taka weet Fa Re ( ) 


where Unax = k2Er7 is the maximal reaction rate. Equation (2.51) points out that Ky 
corresponds to the substrate concentration necessary to have v = Umax /2 (see Fig. 2.8). 
If all molecules of the enzyme are bound to the substrate, the enzyme is said to work at 
saturation and Eq. (2.51) simply becomes v = Umax. 

Michaelis-Menten kinetics, as expressed in Eq. (2.51), corresponds to a Hill function 
with n = 1 i.e. without cooperativity effects. Experimental techniques (from calorimetry 
to spectrometry) exist to measure v for different values of S. Hence, from a data-fitting 


'3-The initial concentration of S$ should also be much bigger than that of E. 


2.4 Michaelis-Menten Kinetics (Enzymatic Reactions) 27 


Fig. 2.8 Enzymatic reaction 
rate according to 
Michaelis-Menten kinetics 


Fig. 2.9 Graphical IIv 4 
representation of Eq. (2.52). 
The inverse of S is expressed in 
M7, whereas the reciprocal 
of v is expressed in s/M 


slope: KyVmax 


s 


1S 


-IKy 


procedure one can estimate both vq, and Ky. Equation (2.51) is rewritten as 


vo 1 
Umax 1+ Au 
Umax Ku 

1 ai 

v = S 


1 1 Ku 1 
= Epa! (2.52) 


v Umax Umax S 


The graphical representation of Eq. (2.52) in the plane ¢, 3) — see Fig. 2.9 — is a line 
that crosses the x axis at —1/Kyy and the y axis at 1/vmgx. The line slope is given by 
Km /Umax- Equation (2.52), however, demands to express the reciprocal of v with respect 
to the inverse of S: this is error-prone when dealing with small changes at high substrate 


concentrations. 
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Take Home Message 

e« A mathematical model of a synthetic gene circuit must be predictive i.e. it shall be able 
to foresee how the circuit dynamics changes by varying the initial conditions (species 
concentrations and kinetic parameter values) of the circuit. 

¢ Ordinary differential equations that describe how species concentrations change over 
time are strictly connected to the fluxes of the bio-chemical reactions. 

e Mass action kinetics allows the definition of very detailed (mechanistic) models. 
However, it demands the knowledge of many kinetic parameters that are not easy to 
be measured. 

e Hill functions simplify the representation of certain interactions such as promoter 
activation and repression. They are characterized essentially by two parameters termed 
Hill (cooperativity) coefficient and Hill constant. Both can be estimated rather easily 
from experimental data. 

¢ Michaelis-Menten kinetics is a description of enzymatic processes. Essentially, it 
corresponds to a Hill function in the absence of cooperativity. 


Check for 
updates 


What You Will Learn in This Chapter 

The dynamics of a synthetic gene circuit is obtained by solving a system of ordinary 
differential equations. However, even simple ODE systems can be too difficult to be 
tackled analytically. Hence, you shall recur to numerical methods. In this Chapter we 
will show how to use the software COPASI to write a model for a biological system 
and compute the dynamics of it. In particular, we will consider some of the models we 
derived in Chap. 2. You will learn how to build and simulate a model based on mass action 
kinetics and Hill functions. Moreover, useful features provided by COPASI — such as the 
definition of a report and the “parameter scan” function — are here described. We have 
chosen COPASI because it is open-source and easy to use thanks to an intuitive graphical 
user interface. 


3.1 Modeling with COPASI 


The first synthetic system we consider for our computational analysis and simulations is 
the same as in Eqs. (2.1). It is made of four species (the inactive promoter P, the active 
promoter P*, the monomer M, and the dimer D) and seven reactions, namely 


p 1, p* 


k_ 
p* —> Pp 
« ko 
P" —> P+M 
ry 
2M — D 
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Table 3.1 Rate constant 
values [33] for the reactions in 


Rate constant | Value Unit 


‘5 —1 
Egs. (3.1) ky 10 s ! 
ke 0.01 |s7 
ko 0.5 sl 
5 10° M7!s7! 
€ 10 Ss” 
kam 0.00116| s~! 
kip 0.00116 | s~! 
D—>2M 
k 
mM —4 
k, 
p= .. (3.1) 


The four ODEs that determine the system dynamics are listed in Eq. (2.6). They can be 
solved numerically with the software COPASI [24] that is freely available at http://www. 
copasi.org.! 

After opening the COPASI graphical user interface, in order to write the reactions in 
Egs. (3.1), open the branch “Model” and then “Biochemical”. Click on “Reactions”, in 
the navigation panel on the left, and then on the “New” button. A new reaction (named 
“reaction_1’’) is created. Double click on it and the corresponding definition panel appears. 
Let us insert the first reaction in Eqs. (3.1). Since its rate constant is kj, we will call this 
reaction “k1”. Change the reaction name from “reaction_1” to “k1”’. In the pink bar below, 
write “ P -> P* ” (pay attention that each symbol — also the stoichiometric coefficients, 
when present — should be separated from the ones nearby by a blank space). Click 
on “Commit” button. Change, in the “Symbol Definition” area, the value of k; to “le0S” 
s~! as in Table 3.1 (the units are given by COPASI automatically. Notice that the kinetic 
rate constant of each reaction is called k1 by COPASI but this attribute has nothing 
to do with the reaction name). Click on “Commit” again. The first reaction has been 
inserted. Repeat the procedure for the other six reactions in the model. Here, we treat all 
reactions as irreversible. The reason is that, as we will see later, only irreversible reactions 
are accepted by a solver for stochastic simulations. 

While inserting the reactions, COPASI automatically updates the list of species and 
parameters in the system. Once all reactions have been inserted, the branch “Species” will 
display four elements. By clicking on each of them a new panel is opened. Here, it is 
possible to set the species initial concentration. At the very beginning of our simulations 
only one species should have a concentration different from 0, namely P. P initial 


'Here we use COPASI 4.16 (Build 104) on a Mac computer — OS X Yosemite. 
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concentration is set to 10~° M that, as we have seen in Chap. 2, corresponds to a single 
DNA molecule in a volume of 1.66 10~!> | (an approximation of the E. coli cell volume). 
Click on P and write “le-09” as initial concentration. Click on “Commit”. Set all the 
other initial concentrations to 0. In order to set the compartment volume (the whole cell, 
in our case) to the value of 1.66 10~!* ml, click on the branch “Compartments” and then on 
“compartment” (this is just a name like “reaction_1” above). Write “1.66e-12” as “Initial 
Volume (ml)” and then click on “Commit”. 

Notice that, by clicking on “Parameter Overview’, you can see (and change) the initial 
species concentrations, the rate constant values, and the compartment volume. 

Before starting with the system simulations, we have to define a plot. Open the branch 
“Output Specifications”, click on “Plots” and then on the “New” button: “plot_1” is 
created. Double click on “plot_1” to open the corresponding definition panel. Click on 
the “New Curve” button. In the new window that appears on the screen open, under “X 
Axis:”, the branch “Time” and select “Model Time”. Under “Y Axis:” open the branch 
“Species” then “Transient Concentrations” and select all species in the system. Click on 
the “OK” button. Back to “plot_1” panel, click on “Commit”. 

In order to run circuit simulations, open the branch “Tasks” and then click on “Time 
Course”.* Set “Duration (s)” to 10,000s and “Interval Size (s)” to 1s. Open the curtain 
menu “Method” and choose “Deterministic (LSODA)”.* Click on the “Run” button. A new 
window corresponding to “plot_1” appears and shows how the concentrations of the four 
circuit species vary over the selected time interval (see Fig. 3.1). At 10,000 the circuit has 
already reached its steady state (equilibrium) and the concentration of each species stays 
constant. 

In Chap. 2 we calculated that at steady state there are about 44 monomers, 193 dimers, 
and P* ~ Pr. To check if our estimations were correct, we can either look at the 
“Result” under the branch “Time Course” or go to another branch of “Tasks”, namely 
“Steady-State”. In the “Steady-State” definition panel, uncheck “calculate Jacobian” and 
click on “Run”. In the “Steady-State Result” panel, change “Concentrations” to “Particle 
Numbers’, you should see the same values as in Fig. 3.2. They confirm that our previous 
calculations were correct. 


2By default COPASI uses millimoles (mmol) to express the species amount and ml to quantify a 
volume. These units can be changed by clicking on the branch “Model”. In the corresponding panel, 
on the right, you can set “Volume Unit” to “1” and “Quantity Unit” to “mol”. 

31f you open this branch and click on “Result” you can see the numerical values of species 
concentrations over time. You can always switch from concentrations to particle numbers by clicking 
on the corresponding button in the toolbar. 

4In order to see the ODEs associated with the model, open (in the navigation panel) the branches 
“Model” and “Mathematical” then click on “Differential Equations”. 
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Fig. 3.1 Species dynamics simulation 
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Fig. 3.2. Number of molecules at steady state 


3.2. Adding RNA Polymerase to the System 


In order to make the model more detailed, we can consider the presence of RNA 
polymerase (RN Ap) explicitly. The reactions associated with the system become 


k 
RNAp+ P —> P* 
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pt =. pi RNAp 
pt”, p4M4+RNAp 
2M —> D 

D—>2M 


kam 


M — 


kab 


ps (3.2) 


Following the results in Chap. 2, let us set RN Ap initial concentration to 1 M. This 
value is huge and can be taken as a good approximation of an infinite amount. Therefore, 
we expect that it will not affect the species concentrations at steady-state as they were 
calculated for the system in Eqs. (3.1). Indeed, if you run the task “Steady-State”, you will 
not see any change with respect to the values in Fig. 3.2. 

By following Eq. (2.18), let us set RN Ap initial concentration to y = 5.110~°M. 
Since RNAps; ~*~ RNApr, we expect that only half of the available promoters gets 
activated. Numerically, we have that RN Aps; = 5.0995 10-°M, PX = 0.499975 10-°M 
and only about 30 monomers and 92 dimers are present at the equilibrium. Finally, let us 
set RNA polymerase initial concentration to 2.1 10~° M. From our theoretical analysis, 
this amount of RN Ap should be able to activate much less than half of the total promoter 
amount. Indeed, we have that P*, ~ 0.29 10~? M. This leads to the synthesis of roughly 
23 monomers and 51 dimers. Overall, this analysis points out that about 2100 molecules 
of RNA polymerases can not be considered as an infinite quantity with respect to a single 
promoter molecule (when using the parameter values in Table 3.1). 

In Chap. 2 we have seen that a similar analysis can be performed, for instance, with 
respect to the kinetic parameter kj, i.e. the RNA polymerase-promoter binding rate 
constant. Here, we will calculate how P*, changes with respect to different values of ky. 
To this aim, first create a new plot (“plot_2”) where only P* concentration is drawn for 
different time points. Click, then, on “Parameter Scan” under the branch “Tasks”. In the 
“Parameter Scan” definition panel verify that the “Task” is “Time course” then click on 
“Create”. In the new window that pops up, select “Reaction”, “Reaction Parameters”, “k1”’, 
and finally “k1”. Click on “OK”. Back to the definition panel, let the number of interval be 
equal to “10” and change “min” to “1le04” and “max” to “1le07”. Click on “Run’’. “Plot_2” 
shows how P;. approaches Pr with increasing values of kj. Moreover, on “plot_l” one 
can visualize how dimer and monomer dynamics are affected by different choices of k1 
(see Fig. 3.3). 

In order to use the results of a “Parameter Scan” it is recommendable to define a Report. 
Click on “Report” and a small window pops up. Click on the symbol beside “Report 
Template”, a new definition panel appears. Set “Name” to “kl scan” and keep “Steady- 
State” as a “Task”. Click on “Item”. In the new window select “Reactions, Reaction 
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Fig. 3.3 Parameter scan. (a) P* concentration for increasing values of k,. (b) Dimer concentration 
at different k, values 


Parameters, k1, k1”, then click on “OK”. Click on “Separator”. Click on “Item” again. 
This time select “Species, Transient Concentrations, [P*](t)” and click on “OK”. Click 
on “Commit”. The new report “kl scan” appears under “Output Specifications, Reports” 
in the navigation panel. Go back to the “Parameter Scan”. Click on “Report”: “Report 
Template” should be set to “k1 scan”. Click on the symbol beside “Filename” and choose 
a file where to write the report (uncheck “Append” but keep “Confirm Overwriting” 
selected). Click on “OK”. The “Task” in the “Parameter Scan” definition panel is now 
“Steady State”. Click on “Run”. The top part of the report file contains on each line the 
value of ky followed by P* concentration. For kj > 810°M~!s7! the active promoter 
concentration at steady state is greater 0.97 10? M. To make the parameter scan analysis 
more complete, modify “kl scan” report by inserting a further separator and then, as a 
new item, select “Species, Transient Particle Numbers, D(t)’. In the “Parameter Scan” 
definition panel set “min” to “le05” (our value in Table 3.1) and max to “1le08”. Click on 
“Run”. From the new report we can see that P* > 0.9910~°M for ki > 310’M~!s7! 
and the number of dimers at steady state (between 192 and 193) does not change 
considerably for k; € [107, 108] M~! s~!. 


3.3. Activated Promoter 35 


3.3 Activated Promoter 


Another simple system that we considered in Chap. 2 is made of a promoter that is inactive 
(P) in its ground state and gets activated (P*) by the binding of n activator proteins (A). 
P* is able to transcribe a reporter protein, F. If we suppose that n = 2 and A is neither 
produced nor degraded, the system reactions are 


2A +P 
P* 


P* 


P 
ka 
F— . (3.3) 


In order to simulate this system, we can make use of the rate constant values in 
Table 3.2. P initial concentration is equal to 10~° M; A initial concentration is 1.5 10~°M 
(about 1500 molecules). Under these initial conditions, P gets moderately activated 
(Px ~ 0.18 10° M) and, at steady state, we have about 79 reporter proteins. 

As explained in Chap. 2, a mechanistic model as in Eqs. (3.3) is generally replaced 
by a phenomenological model based on Hill functions. A Hill function lumps all the 
interactions between activators and promoter. Moreover, the knowledge of P and P* 
concentration is no longer necessary to determine the dynamics of F’. Equations (3.3) 
are rewritten as 


kt H(n=2,Ky,A) 
— F 
kp 
— 


eg << (3.4) 


Here, kj, = koPr = 0.510-?Ms7! and ky = ky Pr = 107'*Ms—!. H(n = 
2, Ky, A) represents a Hill function that depends on the concentration of A and the 
parameters n and Ky — see Eq. (2.25). With respect to the model in Eqs. (3.3), which 


Table 3.2 Rate constant 
values for the system in 


Rate constant | Value Unit 


9 =2e=1 
Eqs. (3.3) i " — 
ky 0.01 | s~ 
ko 0.5 sg! 
ki 10-5 | s7! 


ka 0.00116} s—! 
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obeys to mass action kinetics, the new model in Eqs. (3.4) reduces the species number 
from 4 to 2 and replaces the parameter k, and k_, with the Hill constant and coefficient 
that, as shown in Chap. 2, are determined from experimental data. 

According to the reactions in Eqs. (3.4), the ODE that gives the dynamics of F is 


A)2 
dF Kn 
= =hkot+ —kaF. 3.5 
ap ee (A? d (3.5) 


In our system we have set n = 2. In order to obtain a value for Ky that is connected 
to the model in Eqs. (3.3), we can apply the steady-state condition to Eq. (3.5) and use the 
value of F;; = 7.9 10-8 M that we have found by simulating Eqs. (3.3) with COPASI. 

After setting Eq. (3.5) equal to zero, we have that 


A ~6 
Ke ~ 3.1610-°M (3.6) 
a 
where 
ka Fig — & 
ga ee pe (3.7) 
kif 


How to write into COPASI the reactions in Eqs. (3.4) such that they generate the same 
ODE as in Eq. (3.5)? 

The second and the third reaction in Eqs. (3.4) (let us call them “kb” and “kd”, 
respectively) do not present any particular novelties with respect to all the other reactions 
we have met so far. In contrast, the first one (“‘ktr’) does not follow mass action kinetics. 
Within COPASI it becomes “A -> A + F” (which simply means that F is produced by 
A with no change in A concentration) and has to be associated with a Hill-function- 
based kinetics. As we will see below, it is convenient to treat both n and Ky as “Global 
quantities”. To this aim, click on “Global quantities”, “New”, and then “quantity_1’’. Call 
it “n_hill” and write “2” in the “Initial value” bar. Click on “Commit”. Repeat the same 
procedure for Ky (call it “k_hill’”) and set it to “3.16e-06” molars. As above, set the 
compartment volume to 1.66 10~!? ml and F initial concentration to 0. Create the three 
system reactions. As mentioned above, “ktr’” requires the expression “A -> A + P”. Click 
on “New Rate Law”. A new panel appears. On the “Function” bar type a name for the new 
kinetics, for instance “Hill activation” (the default name is “Rate Law for ktr’’). In the panel 
“Formula” write “k*(A/kh)An/(1+(A/kh)An)”.° “Function Type” should be “general” and, 
in the “Parameters” panel, choose “Substrate” as “Description” for “A’, whereas “k’’, “kh’’, 


5 This formula can be re-used for other reactions, if necessary. In order to check its correctness, click 
the button with the symbol ./x on the right of the panel. 
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Fig. 3.4 Reporter protein dynamics: comparison between different kinetics. Promoter activation 
appears faster by using a Hill function. The time the system needs to reach F's55/2 corresponds to 
about 683s in the mass action kinetics framework, whereas it is over 100s shorter (576s) when a 
Hill function is adopted 


and “n” are set by default as “Parameter’’. Click on “Commit”. Set the concentration of A to 
“1.5e-06” molars. Go back to the reaction panel and click on “ktr’”. Open the curtain menu 
“Rate Law” and choose “Hill activation”. In the “Symbol Definition” panel set the value 
of “k” to “0.5e-09” M/s, and assign “kh” and “n” to “k_hill” and “n_hill”, respectively, 
within the “Mapping” column. Click on “Commit”. Run a “Steady-State” task. The value 
of F;; is about 79 molecules, as in the mechanistic model. Although the two models give 
the same concentration of F at steady state, the dynamics of F is slightly different in the 
two theoretical representations (see Fig. 3.4). By using Hill functions you do not need to 
know in details the promoter-activator interactions since they are “contained” into the Hill 
constant and coefficient that, as we mentioned above, are determined experimentally. 
Repressed promoters are modeled in a similar way. Their Hill function is just slightly 
different and a reaction for the basal production (leakage) is not strictly necessary. 


3.3.1 Chemicals as Input Signals 


Let us consider Eq. (2.38) where the production of a reporter protein F is regulated by a 
chemical, the inducer i. 

In a wet-lab experiment, one lets the system reach a first equilibrium in the absence 
of inducers. Here, the level of F is due to the sole basal production, hence it is very 
low. At this point, a certain amount of i is injected into the system. As a consequence, F 
concentration increases until it reaches a new steady state. How to mimic this experimental 
procedure with COPASI? 
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The system reactions are analogous to those in Eqs. (3.4) with the only difference that 
the Hill function depends on the concentration of i i.e. H(n, Ky,i). We can follow the 
same instructions illustrated above for the activated promoter and write a new model in 
COPASI. We have to turn A into i and set its initial concentration to 0. Let us keep the 
same values for k;, and kg but increase ky, to 10—!2 M/s. Moreover, let us consider n = 2 
and Ky = 10~’M. At steady state the concentration of F is only about 8.62 107!° M (less 
than one molecule). Let us select “Tasks, Time Course”, check “update model’, and click 
on “Run”. Uncheck “update model” and select “Parameter Overview” in the navigation 
panel. The initial concentration of F has been updated to its steady-state value. Change i 
concentration to 10~> M and run a “Steady-State” task. At the new system equilibrium F 
concentration is high, around 4.31 10~’ M (i.e. almost 432 proteins). Therefore, by using 
the “update model” option and changing the inducer concentration after a proper time 
interval, we have reproduced in silico the same procedure we follow in the lab. 


Take Home Message 

¢ Models based on mass action kinetics are written in COPASI by simply entering the 
species initial concentration, the cell volume, and the bio-chemical reactions together 
with the value of their rate constants. 

e Hill functions — and other kinetics — require the definition of new rate laws. Reactions 
that make use of these kinetics shall be then formulated accordingly. 


| 
Check for 
updates | 


What You Will Learn in This Chapter 

A deterministic modeling approach — as the one explained in Chap. 2-assumes to deal 
with systems isolated from any source of noise, or randomness, that could spoil model 
predictions. In classical physics this approximation is generally valid. Biological systems, 
however, are intrinsically noisy, which makes deterministic models inappropriate to 
simulate, at least, certain kinds of synthetic gene circuits. The stochastic simulation 
(or Gillespie) algorithm is a purely computational approach to calculate the temporal 
evolution of biological systems. In principle, it may give a more realistic descrip- 
tion of gene circuit dynamics. However, it becomes computationally too demanding 
if applied to large synthetic networks. In this Chapter, we will explain the theoretical 
foundations, working, and recent improvements of this algorithm. Moreover, we will 
discuss under which conditions a deterministic model can provide faithful predictions, 
despite the complexity of the system, and when, in contrast, stochastic simulations 
are preferable. 


4.1 Stochastic Systems 


A deterministic model gives always the same predictions if the system initial conditions 
(parameter values and initial species concentrations) do not change. This framework works 
well for a system where the noise is negligible i.e. the system is isolated from any possible 
source of disturbance and perturbations. This can be considered as true if you study the 
motion of a solid body, such as a stone, falling from different heights. However, if instead 
of a stone you use a piece of paper and the place where you conduct the experiments 
is subject to frequent changes in wind speed and direction, every time you let the paper 
fall (from the same height), it will trace a different path in the air and land, at different 
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times, on a diverse spot on the ground. Such a motion is called stochastic i.e. the trajectory 
designed by the paper is due to various events that happen randomly. Therefore, even if 
your deterministic equations take into account the effect of the friction of the air, they will 
nevertheless fail to predict the system dynamics correctly and you will have to rely on 
some probability laws that can tell you only where the paper is likely to be at each instant 
of its fall. 

A classical example of stochastic processes is the random walk. In its easiest formula- 
tion one can consider a molecule that moves along a line. At each time step (discrete time), 
the molecule can jump one position to the right (discrete space) with a certain probability 
Pr, One position to the left with probability p;, or keep still (no motion) with probability 
Po = 1-— p; — py (see Fig. 4.1). 

Each simulation will give a different path even if every time we place the molecule, at 
t = 0, at the same position x. Is there anything we can predict in a stochastic system? For 
instance, how far from the initial position can we expect to find the molecule after N time 
steps? 

If, for simplicity, we set py = pj = 5 and assign | to a jump to the right, —1 to a jump 
to the left, and 0 to no movement, the average distance ((s)) covered at each time step is 


(s) =1-p-—1-pr+0- po= =0, (4.1) 


which is of no use. However, the average of the square of s has a non-negative values 


eae eae (4.2) 
3° 3 3 
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If the whole walk is made of N = 3 steps, the square of the total distance s7 covered by a 
molecule will be on average 


(st) = ((s1 +82 +53)?) = (st) + (83) + (83) + 2(sis2) + 2(s193) + 2(s283). (4.3) 


All products in Eq. (4.3) average out to zero and we remain with 


(s2) = (92) + (93) + (8?) =3- = =2. (4.4) 


The result in Eq. (4.4) can be generalized as 


2, 2 Re 
(sp) = 3N => (sr) © aN. (4.5) 


Therefore, after N steps we can expect, on average, that a molecule will be approximately 


J3N positions away from its starting location. Though not precise,! this is a hint on the 
system behavior. 

Stochastic processes are common in biology (for example, think about the thermal 
motion of molecules). Gene expression, which is at the basis of synthetic gene circuits, 
shows random fluctuations as well. In general, we distinguish between two kinds of noise: 
intrinsic (due to the system itself) and extrinsic (due to random processes acting on the 
system). Transcription is stochastic since, for instance, initiation can have success or be 
aborted, RNA polymerases move discontinuously along the DNA chain and even stop, etc. 
Moreover, promoters can be activated/repressed with different strength depending on the 
transcription factor amount that is determined, in its turn, by a stochastic expression. On 
the whole, circuit performance shows a clear cell-to-cell variation within a population of 
re-engineered cells. The overall noise in the system is quantified with the coefficient of 
variation (Cy) 


Cy = —., (4.6) 


where x is a quantity measured in the lab (e.g. fluorescence) and o(x) is the standard 
deviation of x.2 

Elowitz et al. [14] used the notation 7;o; for the coefficient of variation in Eq. (4.6) 
i.e. the total noise in a system. In order to quantify the intrinsic (7jn;) and the extrinsic 
(Next) Component of 7707, they inserted into E. coli cells two (almost) identical genes. 


Tn general, (sp)? # (ee): 


2The standard deviation of x — over a set of N data — is calculated as: o (x) = J eA (xj - (x))2. 
It represents the average error associated with each measurement. 
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Fig. 4.2 Circuit scheme for Extrinsic noise 
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Fig. 4.3 Intrinsic and extrinsic 
noise over a cell population 


= 
a= ‘ 
2 ra Total 

° noise 
i ce lee 

e e 
ee ° Pee. ate 
& - ° Intrinsic Extrinsic 
. ° : : 
is é noise noise 


CFP intensity 


One expressed the cyan (CFP) allele of the green fluorescent protein (GFP), the other 
gene expressed the yellow (YFP) allele (see Fig. 4.2). The two genes were transcribed by 
the same kind of promoter (regulated by Lacl). nin¢ was measured as the mean relative 
difference in fluorescence intensity of CFP and YFP within the same cell.* Since it holds 


that 72, = eg + 72,,, the extrinsic noise was calculated as nex: = ,/ Cy = ies (see 
Fig. 4.3). In the absence of njn;, the expression of the two fluorescent proteins would be 
fully correlated (showing the same fluctuations to 7¢x;) and the cells would look yellow 
at the microscope, whereas 7;n; > 0, corresponding to differences in the intensity of CFP 
and YFP, would cause the cells to have diverse colors from red to green. 


31f we call C F; and YF; the cyan and yellow fluorescence measured from the ith cell over a 
population of N samples, the average intrinsic noise corresponds to 


iat 
Nine = 5 DICK — VFL. 


i=1 
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The promoters leading the expression of the two fluorescent proteins were down- 
regulated by the LacI repressor. The intensity of transcriptional repression was modulated 
by using various concentrations of IPTG (isopropyl-6-D-thiogalactopyranoside), a chem- 
ical that binds LacI and prevents it from DNA binding. This allowed studying how noise 
varied as a function of the promoter transcription initiation rate. 

Elowitz et al. showed that 7j;,; becomes much higher when the transcription rate 
is lowered i.e. when there are less fluorescent proteins in the system. This means 
that variations weigh more when a system is made of a small number of molecules. 
Furthermore, transcription regulation via LacI repressor gave a clear contribution to the 
extrinsic noise as well, underlining the fact that the cells in the population were producing 
diverse amount of Lacl. 

Finally, a particular variant of this system was realized by expressing LacI via a 
Repressilator circuit [13].4 This configuration let register the highest values for no; and 
put in evidence how, in a gene circuit, the noise is higher during a transient phase rather 
than at a stable steady state. 


4.2 The Chemical Master Equation (CME) 


The Chemical Master Equation (CME) permits to calculate the probability that a system 
will be in the state x at time f given that it was in the state x at time fo (initial conditions). 
This conditional probability, denoted as P(x, t|xo, fo), is a key concept in the study of 
stochastic systems [23]. 


4.2.1 CME Derivation 


A system contains N distinct chemical species (S),..., Sy) that interact through M 

different reactions (Rj,..., Ry). The system is supposed to be well-stirred (spatially 

homogeneous). We denote its volume as V. At any time f the state of the system is given 

by the vector x = (x1,..., xy), where x; is the number of molecules of the species S;. 
Each reaction R;, j = 1,..., M is characterized by two quantities: 


¢ the state-change vector vj = (11;,.-.., aj), where vj; represents the change in x; 
when kj; occurs. If the system state is x and Rj; takes place, then the new system 
state will be x + v;. Notice that the stoichiometry matrix N has M columns each 
corresponding to a different vt, 

¢ the propensity function a; (x). Given the system in x at a certain time f, aj(x)dt 
represents the probability that reaction R; takes place in the interval [t, t + dr). 


“The concentration of Lacl in the Repressilator oscillates over time — see Chap. 6. The Repressilator 
was inserted into lacI~ cells. 
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What is the probability that the system is in the state x at time ¢ + dt? This condition is 
verified if 


System at time f | Reaction| System at time t + dt 
x none x 


x— vj Rj x 


In all the other possible cases the system does not lie in the state x at time t + dt. The 
probability that no reaction occurs in the time interval dt given that the system is in the 
state x at time ¢ corresponds to: | — ee , a; (x)dt. If we require that the system is in the 
state xg at t = fo, the conditional probability of finding the system in the state x at time 
t + dt is given by 


M M 


P(x, t+dt|Xo, to) = P(x, t|xo, t0)(1— )) aj(x)dt)+) > P(x—vj, t|Xo, to)aj(K—v,)dt 
j=l j=! 


M 
= P(x, t|xo, t0) — )> P(x, t|xo, fo)aj(xdt 
j=l 


M 
aie ~~ P(x — vj, t|Xo, to)aj(x — vj)dt. 
j=l 


(4.7) 


Equation (4.7) can be rewritten as 


P(x, t+dt|xo, fo) — P(x, t|Xo, to) _ 
dt = 


M 
YS [PK—9j, t1xo, t0)aj(X—vj)—P (x, t Xo, 10a; ()] 
j=l 

(4.8) 
that, as dt approaches zero, becomes 


M 
OP Cx thx.) _ 


ay [P(x — vj, t|Xo, to)aj(x — vj) — P(X, t|xo, to)a;(x)]. (4.9) 
j=l 


Equation (4.9) is the Chemical Master Equation. In principle, the probability that a system 
reaches the state x at time t is calculated by integrating Eq. (4.9).° However, due to its 
complexity, both analytical and computational solutions of the CME are too difficult. As an 


5In a deterministic framework, the initial conditions (xq, fg) determine unequivocally the ODE 
solutions. In a stochastic framework, (xq, fg) determine unequivocally the probability that the system 
lies in x at time fr. 
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alternative strategy, computational algorithms have been developed to design the trajectory 
x(t) of a stochastic system in the phase space.° This permits to simulate if the system under 
study will end, at a given time, into a particular configuration. 


4.3. The Stochastic Simulation Algorithm 


The Stochastic Simulation Algorithm (SSA) was published by D.T. Gillespie in 1976 
[22] and represents a milestone in the study of stochastic systems. As stated above, 
this algorithm cannot compute P(x, f|xo, fo) and solve the CME but it provides a way 
to determine the temporal evolution of a stochastic system, x(t). The main quantity 
that permits to calculate x(t) is a probability function referred to as p(t, j|x, t)dt. It 
corresponds to the probability that, given the system in the state x at time ¢, the next 
reaction will be R; and will occur in the interval [t + t,t + t + dt). To be more precise, 
P(t, j|X, f) is a joint probability density function in the two random variables t and j (i.e. 
the time T to the next reaction and the index j of the next reaction). Why is p(t, j|x, t) 
important? If you know it, you can estimate t and determine R;. Hence, p(t, j|Xx, t) is 
essential for drawing the temporal evolution of a stochastic system. 

In order to calculate p(t, j|x,t), let us define Po(t|x, t) as the probability (given the 
system in x at time f) that no reaction occurs in the interval [t, +17). Thus, p(t, j|x, t)dt 
can be written as 


P(t, JIX, dt = Po(t|x, t)a;j(x)dt. (4.10) 
Notice that a;(x)dt in Eq. (4.10) is the probability that R; occurs in [t + t,t + t + dT) 


since the system is still in x at time ¢ + t. The probability that no reaction takes place in 
the interval [t, t + t + dt) — given the system in x at time ¢ — corresponds to 


M 
Po(t + dt|x, t) = Po(t|x, 1) — S > ag(x)dt) 
k=1 
M 
= Po(t|x, t) — Po(t|x, t) > ag (x)dt (4.11) 
k=1 


where | — ae ax (x)dt is the probability that no reaction occurs in the interval [t+t, f+ 
t + dt) because, as stated above, the system is in x at time ¢ + t. Equation (4.11) can be 
rearranged as 


®In the phase space every point corresponds to a different system configuration (parameter values 
and species concentrations). A trajectory in the phase space gives the temporal evolution of a system 
from its initial state to the equilibrium. 
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M 
Po(t + dt|x, t) — Po(t|x, t) 
= —Po(t|x,t) D> ax (x) (4.12) 
dt 
k=1 
that, as dt approaches zero, becomes 
M 
0 Po(t|x, t) 
ee = — P(t |x, t) S > ag (x). (4.13) 
OT 
k=1 
If we set ag(x) = ae ag (x), we have 
dP at 
GOED aie (4.14) 
Po(t|x, t) 
that brings to 
Po(t|x, tf) = eM, (4.15) 


By using into Eq. (4.10) the result in Eq. (4.15), we obtain that 
PG, Js) =ajwe OO". (4.16) 


After multiplying and dividing Eq. (4.16) by ag(x), p(t, j|x, t) becomes the product of 
two probability density functions 


aj (Xx) 
ag (Xx) 


p(t, j|x, t) = ag(xje OT, (4.17) 


Therefore, p(t, j|x, tf) provides us with both the “j probability density function” (2 ~ ys 


which allows to determine the reaction R; that will take place at the next computational 
step, and the “‘t probability density function” (ag(x)e~“™"), which permits to quantify 
the duration of the time interval t over which R; will occur. Every iteration of the SSA 
requires the computation of both j and t. They are calculated — via the so-called inversion 
method — starting from their probability density functions in Eq. (4.17). 

In order to determine a value for t, the SSA draws first a random number r; from the 
uniform distribution between 0 and 1. The inversion method assumes that F(t) = r; (i.e. 
t = F—(r1)), where F(t) isa probability function. In general, F(t) is defined as 


F(t) = [ p(t )dt , (4.18) 
0 
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where p(t) is the “zt probability density function” (it holds, also, that F(0) = 


F (+00) = 1). Since p(t) = ag(x)e~™", the SSA uses the following F(t) 


T , 
F(t) / ag(x)e~)* dr’ 
0 


T 
ee 40(X)T 


= —ao(x) 


ao(x) P 


= e HO )t 


By applying the inversion method (r; = F(t)), t is finally computed as 


n= 1- e a0 )t 
l—rl =e 0 
In(1 — ry) = —ao(x)t 


—In(r1) = ao(x)t 
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O and 


(4.19) 


(4.20) 


where the substitution of 1 — r; with r; is justified by the fact that the two quantities 
are Statistically equivalent. In a similar way, in order to determine j the SSA draws, 
first, another random number r2 from the uniform distribution between O and | and then 


calculates the smallest integer 7 such that 
j 
Ya; (&) > r2a0(x). 
fal 


Overall, the SSA requires the following steps: 


1. initialize the system to the state xo at t = fo, 


(4.21) 


2. being the system in the state x at time ¢, evaluate all propensity functions a;(x) and 
their sum ag(x). Notice that the propensity functions depend on the state x in which the 


system is and, therefore, are constantly changing, 


3. draw r; and rz and calculate j and t i.e. which reaction Rj; occurs within the time 


interval tT, 


4. change the state of the system from x to x + v; and increase the time from f to t + T, 


5. go to Step 2 (and repeat the procedure) or end the simulation. 


At every step of the algorithm, a time interval t is computed and a single reaction is 
fired. This provokes a change in the number of molecules of one or more species. Though 
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precise, the SSA algorithm can be very slow. This is due, mostly, to the term —\ since 


ag(x) can be very large, which implies a short t. This problem arises, for ‘aience. when 
some species are present in a great number of molecules (high x; — see below). Generally 
speaking, every algorithm that computes a single reaction at each time increment has the 
drawback of being too slow to simulate highly complex synthetic gene circuits, in spite of 


the accuracy of its predictions. 


4.3.1 AnExample 


The dimer-generation system in Eq. (2.1) is made of four species S;, i = 1, ..., 4 (whose 
number of molecules is indicated with x;, i = 1,...,4) and seven reactions Rj, j = 
1,..., 7. The species are 


¢ the inactive promoter P (S}), 
° the active promoter P* (S>), 
¢ the monomer M (53), 

* the dimer D (S4). 


Each reaction R; present in the system is no longer associated with a rate constant but a 
propensity function a; (x) and a state-change vector v ;. According to considerations based 
on quantum mechanics [22], the propensity function of uni- and bi-molecular reactions, 
such as the ones present in this system, can be written as a product of a constant c; and 
a function that depends on the state vector of the system, x. Hence, aj (x) = c; f(x). The 
reactions that take place in the dimer generation are described as 


Reaction (Rj) vj aj (x) 

R, (promoter activation) (-1, 1, 0, 0) ay(x) = cy x1 

R> (promoter inactivation) (1, —1, 0, 0) an(x) = c2x2 

R3 (monomer synthesis) di, -1, 1, 0) a3(x) = c3x2 

R4 (dimer formation) (0, 0, —2, 1) a4(x) = c44x3 (x3 — 1) 
Rs (dimer dissociation) (0, 0, 2, —1) a5(x) = c5.x4 

Ro (monomer degradation) (0, 0, —1, 0) a6(X) = C6x3 

R7 (dimer degradation) (0, 0, 0, —1) a7z(x) = c7x4 


It should be noted that for first-order reactions (all, apart from R4, in this system) c; 
is numerically identical to the corresponding kinetic rate constant (see Table 2.1). For 
a second-order reaction Rs; — associated with a rate constant k; — where two molecules 
from different species (S; and S;) interact, the propensity function has the form a; (x) = 
CsXxx), and it holds that c, = k;/(V Nay). In the reaction R4 above the reactants are 
two identical molecules and, as a consequence, a4(x) is different from a, (x). Furthermore, 
c4 = 25/(V Nav), where 6 is the dimerization rate constant as in Eq. (2.1) [22]. 
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4.4 Alternative Approaches 
4.4.1 The Tau-Leaping Method 


An approximation of the SSA, the tau-leaping method, allows the occurrence of more 
than a single reaction over a time interval t. However, t should be chosen small enough 
to guarantee that no propensity function changes considerably (therefore, it is good for 
systems with large numbers of molecules). This is the so-called leap condition. The 
number of times (k;) that each reaction (Rj) occurs within a time interval t is a Poisson 
random variable (i.e. it is determined from a Poisson distribution’) and is denoted as 
kj = pj(a;(x), T). After a t leap, the state of the system becomes 


M M 
x(t+ 1) =x(t)+ D> vjkj =x) + D> vj pj (aj), 0). (4.23) 


j=l j=l 


Hence, every time leap requires the generation of M Poisson random numbers. The usage 
of the tau-leaping method (instead of the SSA) is justified if the time interval rt is fairly 
longer than the corresponding one in the SSA and if the quantity 4 k; is clearly bigger 
than M. The tau-leaping algorithm can be summarized as: ) 


1. being the system in the state x at time ¢, estimate the largest value of t that satisfies the 
leap condition®; 

2. for each R;, j = 1,...,M calculate the number kj = pj;(aj(x), tT) of occurrences 
over T; 

3. change the state of the system from x to x + ye , vk; and the time from ¢ tot + T; 

4. go to step | and repeat the procedure or end the simulation. 


4.4.2. The Langevin Equation 


If together with the leap condition it holds that aj(x)t > 1Vj = 1,...,M (ie. the 
average value of each reaction occurrences over t, k;, is much bigger than 1), every 


7The Poisson distribution P(n, A) gives the probability that a discrete random variable K is equal 
ton, where n = 0, 1, 2, ... It is defined as 


Prob(K =n) = A(n, dA) = 


(4.22) 


where A is greater than zero and represents both the expected value (the mean) and the variance of the 
distribution. In other words, Prob(K = n) is maximal when n = X. In the tau-leaping method, K = 
k; and A = a;(x)t. Each k; can be computed with the inversion method after drawing a random 
satubed r2 from the uniform ‘distributing between zero and one and setting r2 = Prob(k; =n). 


8Tp some variants of the tau- -leaping algorithm T is fixed. 
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Poisson random variable pj; (aj;(x), T) can be approximated by a Gaussian (or Normal’) 
random variable with mean value equal to a; (x)t and standard deviation corresponding to 


Jaj(X)T Le. 
pj(aj(x), T) © Vj (aj (xt, aj (x)t) = aj(x)t + ,/a;(x)t-4 (0, 1). (4.24) 


By using Eq. (4.24), we can rewrite Eq. (4.23) as 


M M 
x(t +7) =x(t)+ a vjaj(x)t + a vj/aj(~Nj(0, IVT, (4.25) 


j=l j=l 


where the ./; (0, 1) are statistically independent normal random variables. Equation (4.25) 
is called Langeving leaping formula. Since -V (0, 1) is a continuous distribution, also the 
state of the system, x(t + T), can be regarded as a real variable. Therefore, Eq. (4.25) can 
be turned into a differential form after dividing each member by t!° 


dx(t) _ 
dt 


M M 
Yo vjajx@)) + Do viaje, (4.26) 
j=l 


j=l 


where &;(t) are Gaussian white (i.e. uncorrelated) noise. !! Equation (4.26) is known as 
Chemical Langevin Equation (CLE). If we divide the two members of Eq. (4.26) by the 
volume V in order to express the states of the system as concentrations [x(t)] rather than 
particle numbers, the CLE becomes 


d = M 
m0) =) vjaj(K@) + Yo»; OO) (. coe 
j=1 j=l 


In the thermodynamic limit (V —> ow, xj —> ow,Vi = 1,...,N and the species 
concentrations stay constant), the noise term disappears from Eq. (4.27) and the CLE 
becomes a system of ODEs as in the deterministic case. 


4.5 Deviant Effects 


In general, if the number of species is small (e.g. a circuit made of few genes and proteins), 
changes in species concentration are not negligible (see Eq. 4.6) and stochastic simulations 
should be run. As the number of species in a system increases, fluctuations average out and 


9A Normal distribution with mean value je and variance o2 is defined as VW (uw, 07) = 


_ pw)? 


1 Qo 


210 
t is renamed as dt and the limit dt — 0 is then considered. 


"it holds that ( (NE; (')) = 6;,6(t — 1’). 


, where x is a continuous (real) variable. Notice that -V (u, o2) =put+oNV(0, 1). 


e 
10 
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deterministic models can be used. However, this is not completely true as pointed out by 
Samoilov and Arkin [44]. From their analysis, stochastic effects can play an important 
role also outside this commonly accepted scenario. Three possible deviant effects from 
classical chemical kinetics (such as mass action) are presented in [44]. Here, we discuss 
only one of them, termed type II. 

Let us consider a system made of a Michaelis-Menten enzyme-substrate reaction — see 
Eq. (2.46) — and a binary enzyme inactivation process 


(ky -1) ky 
Se =” (es) 2 PE 
pip... (4.28) 


According to a deterministic analysis, when t —> oo S concentration reaches a plateau, 
whereas F concentration goes to zero (provided that S >> E at t = 0). However, 
stochastically the system behaves differently when the initial number of molecules of E 
is odd. In this case, E never disappears from the system and, therefore, S is completely 
degraded (see Fig. 4.4). Hence, a discreteness in E has repercussions on S no matter the 
amount of S at the beginning of the simulation. Due to its nature, this kind of effects is 
called discrete rather than purely stochastic. 


Take Home Message 

¢ The dynamics of a synthetic gene circuit, mainly if made of few components, may be 
hardly predictable by a system of ordinary differential equations. In such eventuality, 
stochastic simulations are preferred in order to calculate the circuit temporal behavior. 

¢ Probability calculations are at the basis of stochastic algorithms. 


Deterministic Stochastic 
simulations approach 


Eis even 


Eis odd 


Fig. 4.4 Representation of discrete effects 
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The SSA allows to simulate gene circuit dynamics by evaluating, at each step, which 
reaction will take place over the next (short) time interval. It might be computationally 
demanding. 

Variants of the SSA, such as the tau-leaping method, allow more reactions to be fired 
within the same time interval. When appropriate, they reduce the duration of computer 
simulations. 

In general, the dynamics of genetic networks made of a high number of molecules and 
interactions can be faithfully obtained via a deterministic approach. However, deviant 
effects from classical chemical kinetics are possible. 


Check for 
updates 


What You Will Learn in This Chapter 

Previously, we have described basic bio-devices (transcription units) where a promoter, 
under the action of RNA polymerases, transcription factors, and chemicals led the 
expression of a reporter protein. For each circuit, we drew, first, a graphical representation. 
We listed, then, the species and the reactions to be considered in a model. Finally, we chose 
a kinetics, which permitted the translation of the bio-chemical reactions into ordinary 
differential equations. The dynamics of the circuit was obtained by solving, numerically, 
an ODE system. 

The temporal behavior of a synthetic gene circuit is characterized by an initial transient 
phase followed by a steady state where species production and decay reach an equilibrium. 
Synthetic gene circuits can have more than a single steady state. This property is referred 
to as multistationarity. Moreover, steady states can be stable or unstable i.e. they can 
either persist or vanish after a small perturbation in the concentration of one or more 
species. In general, transient phase and equilibria depend not only on the circuit scheme 
but also on the values of rate constants (and other kinetic parameters) and the species initial 
concentrations. 

Stability analysis is a technique that permits to characterize the steady states of synthetic 
gene circuits. In this Chapter we will apply stability analysis to two simple systems — made 
of just one or two molecules — and determine the nature of their steady states. 
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5.1 Steady State Calculation: A Monomolecular System 


Our initial system is made of a single species, S. We model its production as a Oth- 
order reaction! with rate constant k, and its degradation as a Ist-order reaction with rate 
constant kg 


= 
ge 2 (5.1) 


Over time, S varies according to the following ODE 


dS(t) 
dt 


= k, — kaS(t). (5.2) 


From Eq. (5.2) we can see that the system has a unique steady state corresponding to 
SS = ie. Hence, the final concentration of S depends on the ratio between the production 
and degradation rate constants. The dynamics of the system is obtained by solving Eq. (5.2) 
analytically. To this purpose, let us define a new variable / as 


i=..=—i8 (5.3) 


whose infinitesimal increment, d/, corresponds to 


dl = —kagdS. (5.4) 
In the variable /, Eq. (5.2) becomes 

dl 

a —kadt. (5.5) 


Let us integrate the left member of Eq. (5.5) over / and the right member over ft. This 
requires to define the time fo at which the integration starts and the initial condition of the 
system i.e. the value of / at t = tg (Jo). We can set fg = 0, whereas Jp = 1(0) = ks —kg S(O). 
We still need to assign a value to the initial concentration of S: let us indicate it as a general 
quantity So. We can integrate Eq. (5.5) as 


1 47’ t 
dl , 
[Fafa 

Io 0 


log!’ |}, = —kat 


'The order of a reaction corresponds to the number of its reactants. A Oth-order reaction has no 
reactants. For instance, basal transcription (promoter leakage) is usually modeled as a Oth-order 
reaction. 
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Fig. 5.1 Different dynamics s k,/kg> SO 
of S withrespecttothe choice Po eee ee 
of So, ks, and kg 


So = $0 
k,/kg < SO 
(i) 
log | — ) = —kat 
lo 
l —kat 
—-=e, (5.6) 
lo 
Therefore, as a function of time, / becomes 
L(t) = Ibe! (5.7) 


If we substitute Eq. (5.3) into Eq. (5.7) and recall that Jo = ks — kaSo, we express S as a 
function of time 


ks — kaS(t) = (ks — kaSo)e*#" 


kaS(t) = ky — (ks — kaSp)e~™# 


s(t) = ~~ geet (5.8) 
3G 


Equation (5.8) gives the dynamics of S. At tf = 0, S = So, whereas at steady state (i.e. as 
t approaches infinity) we find, as expected, that S(t > 00) = Ez. 

Our monomolecular system has a single steady state where the concentration of S is 
well defined after assigning numerical values to the rate constants ks; and ky. The initial 
concentration of S, So, has no effect on the system steady state but has an influence on the 
system dynamics. Indeed, S(t) looks different if we chose So greater than, lower than, or 
equal to S**, as shown in Fig. 5.1. 
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5.2. Stability Analysis: A Two-Component System 


The monomolecular system in the above example has a unique steady state. As we have 
seen, once k, and kg have been fixed, the steady state is insensitive to changes in the initial 
concentration of S. Therefore, if we imagine to slightly perturb the system by removing 
a small amount of molecules of S at the equilibrium, S would simply reach S** again 
obeying to Eq. (5.8). This is a stable steady state. 

Another example of stable steady state is given by a pen lying horizontally on a table. 
In contrast, a pen placed vertically on a surface is still in a steady state but the steady state 
is unstable. Indeed, a weak push would make the pen fall down (into a stable steady state). 

One can carry out stability analysis to determine the nature of a steady state. This 
demands to study how a system reacts to small perturbations in species concentrations at 
the equilibrium. Let us consider a system made of two species: S; and Sz. Their dynamics 
is described by ordinary differential equations such as 


dS\ 


7 fi(S1, S2) 
dS 
= = folSi, Ss). (5.9) 


ODE systems as in Eq. (5.9) often contain non-linear terms (e.g. a multiplication of 
the concentrations of the two species). However, in the proximity of a steady state, the 
dynamics of a system can be approximated by performing a system linearization.? How 
to linearize the system in Eq. (5.9)? First, we shall expand the functions /|(S;, $2) and 
f2(S1, S2) in Taylor series in a neighbourhood of the steady state (S}*, S5°) 


2 


fe a2 fi 
(S; — SS) + — 
eg pd aS; 9Sx|,, 


(Sy= 5; Se = 8p ye i=1,2 (5.10) 


Fi(S1, So) = fi(ST*, S3°) + = 


95; 


sS 


where |,; means “evaluated at steady state”. Then, we can neglect in Eq. (5.10) the 
derivatives of order higher than one. This is justified by the fact that their contribute to 
Fi CS, Sz) is very small. We are left with 


dS of Of 


= Si, S5 S S _ sss 
Te Si lS; D+ 35° ve M+ asl. (S2 — S53") 
dS 
= fo(S}*, 83°) + i. Gr SP) + re. Ga- 55°). (5.11) 


2This approximation holds for a very small neighborhood of the species concentrations at steady 
state. 
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The steady-state condition implies that f;(S}*, 55°) = 0,i = 1,2. Hence, Eq. (5.11) 
become finally 


dS; Of) of 
= Ss A peel daca S = AS 
a a5i|,, °! 1+ 95, 2) 
dSo Of2 Of2 
= S StS) + Sy — S5°). 5.12 
ze B51 les! 1) 393, ° om) ( ) 
The matrix 
dS} SS So ss 
J= (5.13) 
dS} SS So SS 


is called Jacobian matrix and allows to rewrite Eq. (5.12) as 


dS me 

7 ~IS-8 ) (5.14) 
where S = (5), Sp)" and S85 = (S{°, S55)". Equation (5.14) is an inhomogeneous linear 
ODE system since it contains constant terms (—JS*’). Equation (5.2), studied above for 
the monomolecular system, is another example of inhomogeneous ordinary differential 
equation. In order to solve it analytically, we had to define a new variable (/) such that 
the transformed equation (4 = —kgl) did not contain any constant term i.e. it was 
homogeneous in |. Therefore, to solve Eq. (5.14) we have to turn it into a homogeneous 
linear ODE system. This can be achieved by setting S* = (S — S*): 


= Js*. (5.15) 


dt 


Let us go back to our monomolecular system in Eq.(5.1). This system is already linear 
and, if we set S = Sj and f|(S1) = ks — kgS,, Eq. (5.2) can be rewritten as 


dS; _ afi 
dt dS, \|,, 


SS ks 
(S1 — Sy") = —ka(Si — —), (5.16) 
ka 
where we simply rescaled the system to the steady state S}* = de, With a new variable 
d 


St = §,; - Ez, Eq. (5.16) becomes homogeneous 


dS* 


= —kyS*, 5.17 
oF d (5.17) 
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which is simply a re-formulation of Eq. (5.5). Therefore, Eq. (5.17) admits a solution of 
the form? 


S¥(t) = St#Oe—™. (5.18) 


Overall, a homogeneous ordinary differential equation is solved by an exponential 
function. Therefore, the solutions of the ODE system in Eq. (5.15) can be written as 


STO = cet! 


S¥(t) = ene, (5.19) 


where c; € R, i = 1,2 anda; € C, i = 1,2. Thus, the difference between each 
S;(t) and its corresponding steady states S’* is proportional to an exponential function 
containing A;. What do the A; coefficients represent? Let us consider the time derivatives 
of the functions in Eq. (5.19). They are 


dS* 

ae = Aycie*! = AST) 

dS* 

ae = hocne*2! => A2S5(t), (5.20) 


which, together with Eq. (5.15), leads to the matrix relation 
JS* = AIs* 
(J —aDS* =0. (5.21) 


Hence, the complex coefficient 4; , i = 1, 2 are the eigenvalues of J © The real part of Aj, 
Re(A;), determines if, upon small perturbation, 5; (tf) comes back to or goes away from 
its steady state for increasing values of t (see Fig. 5.2), whereas the imaginary part of A;, 
Im(Aj), is responsible for S; oscillatory behavior.° Therefore, the sign of Re(A;) is what 
determines if a steady state is stable or not. 


3Since S(O) = $,(0) — Si, Eq. (5.18) is identical to Eq. (5.8). 
4Notice that a linear combination of S i (t) and S35 (t) is still a solution of the linear homogeneous 
ODE system in Eq. (5.15). 


ST in Eq. (5.21) is the identity matrix, here corresponding to 


10 
I= ( ) (5.22) 


©Remember that e!? = cos(t) +i sin(t). 
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Fig. 5.2 Meaning of A S1-85° 

coefficients. Upon a small 

perturbation (c1) at steady 

state, S; goes back to its C1 

equilibrium value, S*, only if 

the real part of 2 is negative. 

Here we are considering a SSC Mt 
monomolecular system where 
Im(A) = 0 


5.2.1 Eigenvalues Calculation 


In order to calculate the eigenvalues of J one has to solve the so-called characteristic 
polynomial of J, that follows from the condition 


det(J —AD = 0, (5.23) 
and corresponds to 
WM? —ACSit + Jo2) + Jit 22 — Si2Ja1 = 0. (5.24) 


Notice that in Eqs. (5.23) and (5.24) A is a variable and Jjj, i = 1,2, 7 = 1,2 are the 
matrix elements of J. Since the trace of J is defined as Tr(J) = Ji; + Jo2 and the 
determinant of J corresponds to det (J) = Ji1 J22 — Ji2J21, Eq. (5.24) can be rewritten as 


7 —Tr(Ja+det(J) =0. (5.25) 


The two solutions of Eq. (5.25) have the form 


2 
M2 = ao +,/ a det(J). (5.26) 


Hence, 1,2 are real only if 


2 
ao > det(J). (5.27) 


Otherwise, Aj,2 are complex and S;,2 show temporal oscillations. 
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5.2.2 A Stability Criterion: The Routh-Hurwitz Theorem 
If we write the characteristic polynomial of an n-dimensional system as 
M tao hie aD (5.28) 
all its roots’ have negative real parts if and only if 
Aj > 0 Vi,i € {1,...,n} (5.29) 
where A, are the principal diagonal minors® of the Hurwitz matrix H 


a) 100 00--- 0 
a3 a2 a) 100--- 0 
H—= a5 a4 a3 a2 a, 1--- 0 4 (5.30) 


0000 00-::--a@ 


So, for instance 


Al=q 
a, l 
Ay =|" = a1ad2 — 43 
a3 a2 
a, 1 O 
A3 = |a3 a2 a\| = a1a2a3 + ayas — aya4 - ay. (5.31) 
as a4 43 


Following Eq. (5.28), the characteristic polynomial of a two-component system has the 


1 
form 4? + aA + a2 = 0 and the corresponding Hurwitz matrix is H = [ such that 
a2 
A, = qa, and Az = a,a2. This implies the following stability conditions 
aj,a2 > 0. (5.32) 


7Root means solution of the characteristic polynomial. 


8. minor is the determinant of a square matrix that is a submatrix of another matrix. 
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5.2.3. Steady States in a Two-Component System 


When applied to the characteristic polynomial in Eq. (5.25), the stability conditions in 
Eq. (5.32) become 


Tr(J) <0 
det(J) > 0. (5.33) 


In other words, when in a two-component system the relations in Eq. (5.33) are satisfied, 
both Re(A;) and Re(A2) are negative and the corresponding steady state is stable. More 
precisely, the following situations are possible 


* y,A2 € Rie. A], A2 < O= > stable node; 
e y,A2 € Cie. Re(A1), Re(A2) < 0 = > stable focus. 


The difference between a stable node and a stable focus is shown in Fig.5.3. Since we 
are not considering any reactions (i.e. any rate constants) explicitly, the dimensionality of 
the phase space of our system is dictated only by the number of species (two). Therefore, 
the phase space coincides with the plane (S;, Sz). Here, we can see the trajectory our 
system follows when slightly moved away from its steady state (Fig. 5.3, upper graphs). 
If the steady state is a stable node, the system returns to it following a straight line. In 
contrast, when the steady state is a stable focus, a small perturbation gives rise to a spiral 
in the phase space. The species dynamics is also different (Fig. 5.3, lower graphs). Indeed, 
oscillations in the concentration of both S; and S2 are present when the steady state is a 
focus.” 

Whenever at least one of the expression in Eq. (5.33) is not satisfied, stable steady states 
are no longer possible. Different scenarios can arise: 


* 2y,A2 € RanddAj,A2 > 0 = 5 unstable node; 

© A4,A2 € Cand Re(A}), Re(A2) > O ==> unstable focus!?; 

° Ay,A2 € Rand either 4; > 0 andaAz < OordA; < OandAz > 0 => saddle point 
(unstable steady state). 


°Both stable node and focus are considered as asymptotically stable since a system approaches them, 
after a small perturbation, when t —~> oo. Notice that when the steady state of the linearized system 
is asymptotically stable, the steady state of the original system is asymptotically stable too. However, 
the two steady states can be of different type i.e. a stable node in the linearized system and a stable 
focus in the original one or vice versa. 


10Tf 41,42 € C and Re(Ay) and Re(A2) have different sign, no pattern arises in the phase space. 
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Stable node 


Fig. 5.3. Comparison between a stable node and a stable focus 


S, 


Ss, 
Unstable node Unstable focus 
S, 
M<0 
ho> 0 
Saddle point 


Fig. 5.4 Unstable steady states 


The three kinds of unstable steady states!' are shown in (Fig.5.4). Trajectories in 
proximity of a saddle point bring the system close to the steady state (A; direction) and 
then far away (A2 direction). 

When A1,2 are pure imaginary number (i.e. A1, A2 € C and Re(A1), Re(Az) = 0), the 
linearized system oscillates around a centre (see Fig. 5.5a). This steady state is unstable. 
When the linearized system show a centre, nothing can be inferred about the original 
system. 


‘1 Unstable nodes, unstable focuses, and saddle points in the linearized systems correspond to 
unstable steady states in the original system. However, the kind of the unstable steady states could 
be different (as it happens also with asymptotically stable steady states). 
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A S, 


Centre 


Stable limit cycle 


Unstable limit cycle 


Fig. 5.5 Oscillations. (a) A centre in the phase space. (b) Limit cycles 


Fig. 5.6 Cartoon A B 
representation of the system in k k 
our example — * © _—— 

ka 207 ska 


Finally, oscillatory behaviours can give rise to limit cycles. In the phase space, a limit 
cycles is an orbit that is never reached by any trajectory. Trajectories, indeed, tend to wind 
around to (stable limit cycle) or away from (unstable limit cycle) their limit cycle (see 
Fig. 5.5b). There is no condition for the existence of limit cycles in a linearized system. 


5.2.4 Example 


Let us consider a two-component system where the species A is produced via a Oth-order 
reaction with rate constant k; and downregulated by species B (rate constant k,) via a Hill 
function where n = | (no cooperativity). B is produced by A with rate constant k; both A 
and B are degraded with decay rate constant kg (see Fig. 5.6). 
Parameter values are given in Table 5.1. The initial concentrations of A and B are zero. 
The system is associated with the following ordinary differential equations 


ae 25 (K+kgAt+k 
dt = Ks d a 
dB 
=kA—kgB (5.34) 


dt 
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Table 5.1 Arbitrary values for 
the rate constants in our 


Rate constant | Value | Unit 


ks 1.0 |Ms7! 
example 
k i |s-* 
kr 1.0 |Ms—! 
ka 0.1 |s! 
Ky 2 M 
to which corresponds the Jacobian matrix 
- el 
(k + kg) Ky a+e-y 
J= ‘ (5.35) 
k —kg 


In order to determine if the steady state of the linearized system is stable, we have to 
evaluate the conditions in Eq. (5.33). They become, for our system 


Tr(J) = —k —2ka <0 
kky 


det (J) = kalk + kg) + ——_ > 
et (J) d( d) Kul + =p 


(5.36) 


Since Eq. (5.36) are always true, our linearized system has a stable steady state. To find 
out its kind (stable node or stable focus) we shall calculate both Tr(J) and det (J). If the 
condition in Eq. (5.27) is satisfied, the two eigenvalues of J are real and the steady state is 
a stable node. Otherwise, both eigenvalues of J are complex and the steady state is a stable 
focus. In order to evaluate det (J), we need to compute B**. After setting Eq. (5.34) equal 
to zero, we obtain 


BS = K ass 
ks 
k? + kk a  kks —kkagKy —k2K 
om yee ge — d” "As — k, — ky =0. (5.37) 
H d\H 


The second-order equation in Eq. (5.37) is solved both by A** = 1.05M and A** = 
—0.34 M. Since a concentrations cannot be negative, we accept only the positive solution, 
which implies that B°* = 10.5 M. 

With the choice of parameter values in Table (5.1) and the just calculated steady state 
concentration of B, we find that 


Tr (J) 
4 
det(J) © 0.12. (5.38) 


TH = 12 = = 0.32 
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Hence, Eq. (5.27) is satisfied. As a consequence, both eigenvalues of J are real and the 
steady state of our linearized system is a stable node. 


Take Home Message 

¢ Synthetic gene circuits can settle into stable or unstable steady states. The nature of a 
steady state is determined via stability analysis. 

¢ Stability analysis demands to linearize the ODE system in a neighborhood of the steady 
state, determine the corresponding Jacobian matrix, and calculate the eigenvalues of the 
Jacobian matrix. 

¢ The eigenvalues of the Jacobian matrix are complex number. The steady state of 
the linearized system is asymptotically stable if the real part of each eigenvalue is 
negative. The imaginary part of the eigenvalues is responsible for oscillations in species 
concentrations. 

e Ifasteady state in the linearized system is stable or unstable, the corresponding steady 
state in the original system will be stable or unstable too. However, the steady state 
of the original system is not necessarily of the same type of the steady state of the 
linearized system. 


Check for 
updates 


What You Will Learn in This Chapter 

The first two synthetic gene circuits, which appeared in Nature in January 2000, are 
referred to as the Toggle Switch and the Repressilator. As the name suggests, the former 
switches from high to low fluorescence (and vice versa) under the action of two different 
input signals. The latter, in contrast, shows oscillations in fluorescence expression. 
However, both circuits work in these ways only if certain conditions about kinetic 
parameter values and initial species concentrations are met. Stability analysis, whose 
coverage started in Chap.5, permits to find out the requisites under which a synthetic 
gene circuit settles into a steady state that represents a properly-working configuration. 
In this Chapter we will analyze the model of these two synthetic gene circuits with the 
open-source software XPPAUT. First, we will show how to design a hysteresis curve for 
the Toggle Switch and determine the requirements for the circuit to lie into two possible 
stable steady states (multistationarity). Then, we will obtain a graphical representation for 
a so-called Hopf bifurcation that, in the Repressilator case, signals the presence of limit 
cycles that guarantee sustained oscillations. 


6.1 The Toggle Switch 
6.1.1 General Description 


The Toggle Switch is based on two transcription units encoding for different repressor 
proteins (see Fig.6.1). The presence of one repressor shuts down the expression of the 
other (mutual inhibition). One transcription unit is bi-cistronic: the first cistron encodes for 
a repressor, the second one for a reporter protein. Therefore, depending on which repressor 
is expressed, the circuit can lay into two different steady states, one corresponding to high 
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Fig. 6.1 Scheme of the Toggle 
Switch. uw and v refer to the 
dimensionless circuit model 


R,(v) R; (u) GFP 


and the other to very low fluorescence. If the two states are stable (i.e. the circuit cannot 
exit from any of them without receiving a specific input signal), the Toggle Switch behaves 
like a cellular memory and, as such, it might be employed as a basic component for bio- 
computing systems. 

Gardner and co-authors [21] presented two versions of the Toggle Switch. One 
employed the bacterial LacI and TetR repressors, the other LacI and a temperature- 
sensitive variant of the 2 phage cI repressor, termed cI(ts). The switch from one state to 
the other one was realized by sending an input signal to the circuit. When the switch was 
into the LacI-controlled position, a certain amount (experimentally determined) of IPTG 
assured complete LacI inhibition such that either TetR or cI(ts) was produced and bound 
its target promoter. TetR was de-activated by aTc (anhydrotetracycline), whereas clI(ts) 
required a thermal pulse. The Toggle Switch implementation was driven by theoretical 
considerations. The circuit model, which we review below, permitted to determine two 
necessary conditions for bistability i.e. the presence of two stable steady states. They are: 
cooperativity (Hill coefficient n > 1) among the molecules of at least one of the two 
repressors, and a balance in the expression of the two transcription factors (the latter 
was proved by using different RBSs for LacI production). According to these results, 
five working versions of the Toggle Switch were implemented. One of them maintained 
bistability for at least 22h. 


6.1.2. Model Based on Hill Functions 


Following the circuit scheme as designed by Gardner et al. and reproduced in Fig. 6.1, we 
place promoter p; under the control of repressor R; and promoter pz under the action 
of repressor R2. p; leads the synthesis of R2 and p2 the synthesis of R;. Promoter p2 is 
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responsible for the production of the green fluorescent protein (GFP) too. Both repressors 
can be inhibited by the corresponding inducer signals (i; and iz). As mentioned above, we 
expect that the circuit lies into a stable steady state if one of the two repressors is fairly 
more expressed than the other. In particular, when R; >> Rz2, the circuit emits a clear 
fluorescent signal. 

Initially, we derive a reduced model for the circuit where the inducer signals (and the 
fluorescent protein) are neglected, and translation is treated as a single-step event. This 
simplified model will be used to carry out stability analysis with XPPAUT.! 

The Toggle Switch model based on Hill functions demands only two species, Ry and 
Ro, and eight parameter values: the translation (k; and kz) and decay (kg; and kg) rate 
constants of the two repressors together with their Hill constants (Ky; and K 2) and Hill 
coefficients (n; and n2). Overall, the circuit is described by a system of two ODEs 


Aa =k; a kaiR\ 
1+ (zh) 

dR 1 

a =k) El kq2 Ro. (6.1) 
1+ (zn) 


Gardner et al. reported in their paper a model slightly different from the one in Eq. (6.1), 
where dimensionless rate constants were used. We can assume that kgj = kg2 = kg and 
define the dimensionless variables 


dt 
to=kgt —=dt=— 
ka 
Ri 
u= => dR\ = Ky,du 
Kn, 
Ro 
v= => dR = Kyydv. (6.2) 
Ky 


Moreover, following the notation in [21], we should recall n; as y and n2 as 6. With the 
new variables in Eqs. (6.2), (6.1) can be written as 


du 
kaK ny =e kaKy,u 
i oh Se (6.3) 
= Vv. * 
d 2 ae a7 eu? d® Hy 


'XPPAUT is available at http://www.math.pitt.edu/~bard/xpp/xpp.html. We use XPPAUT version 
7.0 on a Mac OS X Yosemite. 
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If then we set 


~ kaKny, 
ky 
= (6.4) 
kaKn 
we finally have 
du _ 1 
a eae 
dv 1 
a 6.5 
dt 14ur 6) 


where, for the sake of simplicity, we have renamed the variable t as t (which still 
represents a dimensionless time). 


6.1.3 Stability Analysis with XPPAUT 


In Chap. 5 we have seen how to make use of model linearization in order to determine 
the nature of the steady state(s) of a biological system. With XPPAUT we can derive 
numerically, from the system model, the conditions under which a circuit has one or 
more steady states and how their nature changes (stable/unstable) by varying the kinetic 
parameter values. 

Let us consider the Toggle Switch model in Eq. (6.5). In order to find out how many 
steady states are possible, we have to set both ODEs to zero. The curves a = 0 and a = 
0 are called nullclines. If we plot them in the plane (uw, v), their intersection points coincide 
with the circuit steady states. We expect to have a stable steady state corresponding to an 
excess in the concentration of either repressors. How do the nullclines change either by 
varying the values of a; and a2 or removing cooperativity from the system? 

XPPAUT takes as an input a text file (let us call it “toggle.ode”’) containing the circuit 
ODEs, the initial concentrations of the species in the system, and numerical values for the 
kinetics parameters. Using the syntax recognised by XPPAUT, Eq. (6.5) are written as 


# toggle-switch 
du/dt= alphal * 1/(1+vAbeta) - u 
dv/dt= alpha2 * 1/(1+uAgamma) - v 


init u=0 v=0 
par alphal=10 alpha2=10 beta=2 gamma=2 
done 


The first line contains just a comment. With the instruction “init” one sets the species 
concentrations, with “par” the parameter values. Here, we start our analysis by demanding 
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Fig. 6.2 Nullclines corresponding toa, = a2 = 10 and B = y = 2 


that a1 is equal to a2 (balanced promoters) and both 6 and y are greater than | i.e. the two 
repressors show cooperativity effects. To start circuit analysis, drag the file “toggle.ode” 
onto the XPPAUT window (let us call it input window). The file gets processed and, if 
no error has been found, another window appears (let us refer to this window as the main 
one). In order to visualise the nullclines, click on “Viewaxes” and then on “2D”. In the new 
window that pops up set “X-axis: uw’, “Y-axis: v’”, “Xmin” and “Ymin” equal to zero, and 
“Xmax” and “Ymax” equal to 12. Click on the “OK” button and this window is closed. 
Back to the main window, click now on “Nullcline” and then on “New”. The nullclines, 
drawn on the Cartesian plane, look like in Fig. 6.2. 

They intersect in three different points. To know if they are stable or unstable steady 
states, click on “Sing pts” and then “Mouse”. Now, click on one of the three intersection 
points, click on “Yes” when the question “Print eigenvalues?” is prompted and “No” for 
“Draw Strong Sets?” (or “Draw Invariant Sets?’”). A new window displays the kind of 
the steady state, its coordinates (species concentrations) and how many eigenvalues are 
both real and lower than 0 (“r7”’) etc. (notice that the eigenvalues are written in the input 
window). The two steady states where one of the two species is much bigger than the 
other are stable, whereas the steady state in between, where u and v have the same value, 
is unstable. 

Let us remove cooperativity from our system by setting 8 = y = 1. Click on “Param” 
on the toolbar of the main window, modify 6 and y values, click on “OK”, and then 
“Close”. Plot the new nullclines as described above.” In the absence of cooperativity, the 
nullclines have no sigmoidal shape and meet in a single point (see Fig. 6.3a). This is a 


2 If the previously plotted nullclines are still on the graph, click on “Graphic Stuff” and “Delete Last” 
to get rid of them. 
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Fig. 6.3 Nullclines corresponding to different choice of a1, a2, 6, and y. (a) In the absence of 
cooperativity, and with promoters of equal strength, the system has a single stable steady state where 
the two repressors are equally expressed. (b) In the presence of cooperativity and with one promoter 
much stronger than the other, the system shows a single stable steady state again where, however, 
only one repressor is basically produced 


stable steady state. Therefore, cooperativity is a necessary condition to have bistability 
i.e. two stable steady states. Is it also a sufficient condition? Let us set back B = y = 2 
and then change a to 1. In this way, u is much more expressed that v. Plot the nullclines 
again. They intersect in a single point, a stable steady state (see Fig. 6.3b). Therefore, 
cooperativity is not sufficient to guarantee system bistability that, in fact, depends also on 
the values of a, and ap. 
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Fig. 6.4 Bifurcation diagram for a generic bistable system. A species concentration S is shown as 
a function of a kinetic parameter k, whilst the other system parameters are supposed to be fixed 


6.1.4 Bistability and Hysteresis: General Concepts 


Bistability and hysteresis are, often, found in circuits that contain a positive feedback loop 
(see Chap. 12). The plot that visualizes them is called bifurcation diagram. Let us forget for 
a moment about the Toggle Switch and consider a generic bistable system. If we plot the 
concentration of a species S with respect to a kinetic parameter k, we will get a bifurcation 
diagram like the one in Fig. 6.4. 

The term hysteresis means, literally, lagging behind and was coined by studying 
magnetic materials. It refers to the fact that the state of a system depends on its past 
conditions. Therefore, hysteresis can be exploited to build memory devices. Looking at 
the diagram in Fig. 6.4, we see that our generic system can lie into two different kinds of 
stable steady states: one characterized by small concentrations of S (let us call it Jow) and 
the other by big concentrations of S (high). If, initially, we set k = k;, the system will lie 
in the low stable steady state marked by a blue circle. By increasing the value of k, the 
system stays in a /ow stable steady state until k = k,. As soon ask > k,,, the system jumps 
into a high stable steady state i.e. S concentration grows discontinuously. In an analogous 
way, if we initialize the system at k = ky, the system will be in the high stable steady state 
indicated with a green circle. By decreasing k, the system keeps lying into a high stable 
steady state until k crosses kg and the system falls into a low steady state. Here, there is 
another discontinuity in S concentration. When k < kg ork > k,, S is either poorly or 
greatly expressed and there is a unique type of stable steady state where the system can 
lie. In the bistable region i.e. kg < k < k,, the system chooses between a low and a high 
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stable steady state depending on the past values of k. The points | (at which k = kg) and 
2 (k = k,) i.e. the red circles in Fig. 6.4 are called saddle-node bifurcation points. Saddle- 
node refers to the kinds of steady states that these points separate; bifurcation means that 
the number of steady states changes when the system crosses these points. 


6.1.5 Bifurcation Plots with XPPAUT 


In order to plot with XPPAUT a hysteresis curve for the Toggle Switch, we have to place 
the circuit into a steady state (preferably the unstable one). Re-draw the nullclines for 
a; = a2 = 10 and B = y = 2 and find out u and v concentration at the system unstable 
steady state.> Click on “ICs” on the toolbar of the main window and set as new initial 
conditions the concentrations of u and v at the unstable steady state. Click on “OK” then 
“Close”. Now it is necessary to open the program AUTO. Click on “File” and then “Auto”. 
A new window appears (AUTO window). Click on “Numerics”, set “Nmax: 2000”, “Par 
Max: 1000”, and then click on “OK”. Click on “Axes” and then on “hI-lo” (meaning high- 
low). ““Y axis” should be “u” and “Main Parm” a. Set “Xmin” to 0, “Ymin” to —2, both 
“Xmax” and “Ymax” to 40. Click on “OK”. Back to the AUTO window click on “Run” 
and then “Steady state”. A first trait of the hysteresis curve is drawn. In order to complete 
it, click on “Grab”. A cross appears on the graph. By pressing “TAB” on the keyboard, the 
cross moves along the hysteresis curve. Place the cross on point 1. Press “Enter” to select 
point 1.4 Click on “Numerics” and change “Ds” to —0.02 (this changes the direction in 
the hysteresis curve design). Click on “OK” and then “Run” in the AUTO window. The 
hysteresis curve is now complete (see Fig. 6.5). Red lines indicate stable steady states, the 
black line is made of unstable steady states, they meet at the bifurcation points. 

Toggle Switch implementations pointed out that random fluctuations in gene expression 
let the two stable steady states coexist at the bifurcation points. Here, a bi-modal 
distribution in the circuit fluorescence has been detected by Gardner et al. via FACS 
experiments (see Fig. 6.6). 

As we have seen above, cooperativity is not sufficient to assure bistability and build a 
Toggle Switch. In order to find the values that aw and w2 should take to confer bistability to 
the circuit, we shall plot the so-called bifurcation lines in the plane (a1, a2). They divide 
the plane into regions where the circuit is either monostable or bistable. In order to draw 
bifurcation lines with AUTO, click on “Grab”, move the cross on the bifurcation point 
3 and then press “Enter”. Click on “Axes” and on “Two par”: the plane will be changed 
into the (a1, a2) plane automatically. Click on “OK” then “Run” in the AUTO window. 


3Click on “Sing pts”, “Mouse”, and then on the unstable steady state in the graph. On the bottom of 
the window, which contains information about the steady state, the concentrations of the two species 
are written: u = 2 and v = 2. 


4The point, on which the cross lies, is written on the bottom of the window under “Lab”. 
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Fig. 6.6 Experimental evidence of hysteresis in the Toggle Switch [21]. (a) A quasi-discontinuous 
jump from low fluorescence stable steady state into high fluorescence stable steady state was 
measured by inducing the Toggle Switch with increasing amount of IPTG. (b) Coexistence of two 
stable steady states at the bifurcation point B1/B2 
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Fig. 6.8 Bifurcation lines with AUTO (6 = y = 2). Notice that for very small values of either a 
or a2 the circuit will never work as a Toggle Switch (only a single stable steady state is possible) 


The bifurcation lines appear on the graph but the lower one is not complete. To finish its 
design, click on “Grab” and move the cross to the edge of the lower bifurcation line. To be 
sure that the cross is on the correct point, look at the circle on the left of the graph, under 
the button “ABORT”: you should see a small cross on the circle itself, as in Fig. 6.7. Press 
“Enter”. Click on “Numerics” and remove the minus sign from “Ds”. Click on “OK” and 
then click on “Run” in the AUTO window. The design of the bifurcation lines is completed. 
The two bifurcation lines divide the plane into two regions. The region between the two 
lines gives bistable conditions i.e. the values of a; and @ that guarantee circuit bistability 
when 8 = y = 2. In the rest of the plane, the circuit has a single stable steady state only 
(see Fig. 6.8). 
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Repeat the whole procedure by keeping a; = a2 = 10 and decreasing both f and y 
down to 1.5. You will observe that the bistable region is reduced. As expected, it collapses 
when both f and y are equal to 1 (no cooperativity).> Overall, if both repressors do not 
show any cooperativity (8 = y = 1), the circuit has always a single stable steady state. 
When £ and y are > 1, both a and a2 should be fairly high to assure bistability. Finally, 
if only one repressor behaves cooperatively, bistability arises only for large values of a1 
and a. 


6.2. The Repressilator 
6.2.1 General Description 


The core of the Repressilator [13] is made of three transcription units each expressing 
a different repressor protein. They are arranged into a ring configuration such that the 
first unit down-regulates the second one, the second unit represses the third one, and the 
third gene closes the circle by shutting down the first one. One of these three transcription 
factors, moreover, controls the expression of a fluorescent protein that represents the circuit 
read-out. Following the original Repressilator design (see Fig. 6.9), the three repressor 
proteins are LacI, TetR and cl. The circuit output is given by a green fluorescent protein 
(GFP). When, for instance, LacI concentration is small, TetR amount gets higher. This 
provokes, however, a decrease in cI concentration that causes an increase in LacI which, in 
its turn, represses TetR expression. Therefore, the Repressilator scheme can be adopted to 
achieve an oscillatory dynamics. Why to engineer oscillations in protein concentration? A 
possible (though still futuristic) application is to release a drug (e.g. insulin or antibiotics) 
in the human body at precise time intervals to fight against a disease. However, getting 
stable oscillations is not a banal task. They depend on a proper choice of several kinetics 


Fig. 6.9 The Repressilator 
scheme. P; tetO and Py lacO tetR 

; . P,lacO | ——————— 
are the two synthetic promoters 
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SHill coefficients lower than 1 would give negative cooperativity that is not present in transcription 
regulation. 
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parameters: promoter strength, protein half-life, mRNA decay rate, repressor-promoter 
affinity, promoter leakage, repressor cooperativity. According to their model, Elowitz and 
Leibler [13] found out that tightly regulated promoters (with scarce leakage), comparable 
half-lives of proteins and mRNAs, and repressor cooperativity strongly favour stable 
oscillations. Therefore, they chose (among the Repressilator components) three promoters 
that could assure both high protein production and tight repression, namely: two hybrid 
promoters made of the A phage P;, promoter fused to either a Jac or a tet operator (i.e. the 
binding site of LacI and TetR) and the A phage Pr promoter, regulated by cI. Moreover, the 
coding regions of the three repressors where modified with the addition of a degradation 
tag that reduced the half-lives of the repressors to a handful of minutes i.e. very close 
to the average mRNA half-live in E.coli (about 2 min). This model-driven Repressilator 
implementation allowed to detect cells that were able to maintain stable oscillations for 
at least 10h. 


6.2.2 Modeling 


A model for the Repressilator that can explain and predict oscillations in protein 
concentration requires to consider both transcription and translation explicitly. Let us call 
M, and Y;,i = 1,...,3 the mRNAs and the repressor proteins in the (core) circuit. Each 
transcription unit is modeled with two ODEs 


dU. 1 
ae =kyt+ ker — ean — kam M4; 
1+(z) 
dF; 
— =kyM—kaF, (i, j) = 1,2), (2,3), B,D (6.6) 


where kj, is the leakage rate constant (units: M/s), k;, is the transcription rate constant 
(M/s), ky is the translation rate constant (s—}), Kam 1s the mRNA decay rate (s7!) and kg 
is the protein decay rate (s~!). For the sake of simplicity, we assume no difference among 
the three mRNAs and among the three repressors. Therefore, the three transcription factors 
share the same Hill constant (K 7) and coefficient (n). 

In order to derive the same dimensionless model as in [13], we shall rescale Y;, i = 
1,...,3 to Ky and the time f to kgm 


| es 
Tt = kamt. (6.7) 


Moreover, to have a dimensionless mRNA, we introduce the quantity h (termed 
efficiency and expressed in Molars) such that 


my = (6.8) 


6.2 The Repressilator 79 
Equation (6.6) can be rewritten as 


k i a kik +k : Kamh 
—_ = —— - mj 
dm ae 1k ae pr dm i 


d 
kamKu oa = kyhmj — kaK un pi (6.9) 


that becomes 


dm; kik a Kur 1 
dt kamh — kamh + pi 
dpi kuh kaa 
dt Kam Ky ' 


™ (6.10) 
Kam p 


mite 


According to the model in [13], we shall set 6 = pt = GAKn 


efficiency h corresponds to 


. This implies that the 


hake 


h= iz (6.11) 
Finally, if we set ag = pe anda = a , we obtain the dimensionless model 
dm; 
dt” a i 
“Pi = —A(p; — mi), (6.12) 


where we renamed 1 as f.° 


6.2.3 Hopf Bifurcation 


If parameter values are well-balanced, the Repressilator exhibits sustained oscillations. 
Otherwise, the three repressors settle either into a stable or an unstable steady state 
(depending on the initial conditions of the circuit). Here, we do not carry out a detailed 
stability analysis but we simply show how the circuit temporal dynamics’ and trajectories 


6Tn a stochastic framework, k;j is expressed in number of proteins (n prot) per number of mRNA 
" prot . aa _ .-l 
AaRNAS: Since [Ky] = “prot and [kg] = s~~, the 


efficiency h results expressed in number of MRNA molecules ({h] = nmr A). 
7Simulated with COPASI. 


molecules (ny RNA) per second i.e. [kyj] = 
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in the plane (pj, p2) change by varying the initial concentrations® and the value of 
(whereas we set a9 = 0, 8 = 1, andn = 2.1 —i.e. the three repressors show cooperativity). 
The file encoding for the Repressilator model in XPPAUT has the form 


# Repressilator 

dm1/dt= alphaO + alpha/(1+p3An) - m1 
dp1/dt= -beta*(p1-m1) 

dm2/dt= alphaO + alpha/(1+p1An) - m2 
dp2/dt= -beta*(p2-m2) 

dm3/dt= alphaO + alpha/(1+p2An) - m3 
dp3/dt= -beta*(p3-m3) 

init m1=0 p1=0 m2=0 p2=0 m3=0 p3=0 
par alphaO=0 alpha=10 beta=1 n=2.1 
done 


Let us start, as in the above file, with pl = 0. Click on “Viewaxes”, “2D”, and then set 
“X-axis: pl”, “Y-axis: p2”, “Xmax: 8”, “Ymax: 8”, “Xmin: 0”, and “Ymin: 0”. Close the 
window by clicking on “OK”. Click on “Numerics”, “Total’’, and set “total: 200” in the bar 
just above “NUMERICS” (total refers to the time interval over which XPPAUT integrates 
the Repressilator ODEs). Press “Enter”, then press “Esc”. Click on “Initialconds” and 
“Go”. XPPAUT carries out the integration and a line appears in the (pj, p2) plane. Click 
on “Sing pts” and then “Go”. The steady state is unstable. Its eigenvalues are complex, 
four have their real part greater than zero, the other two lower than “zero” (which, in the 
linearized system, does not correspond to any specific pattern such as a node or a focus). 
If we repeat the whole simulation by using p; = 2, the system will show a stable limit 
cycle with sustained oscillations (see Fig. 6.10). 

Finally, if we keep p, = 2 but decrease a from 10 to 1, the system steady state changes 
into a stable node (see Fig. 6.11). Therefore, if we plot p; with respect to a, we can expect 
to find a bifurcation point that signs the passage from a stable node to a stable limit cycle. 
A bifurcation point that gives rise to oscillations is called Hopf bifurcation point. 

In order to visualise the Hopf bifurcation with AUTO, one should use, as initial 
conditions, the species concentrations at the stable node (q@ is set to 1). Hence, all six 
species should be set to 0.68729. Open AUTO. Click on “Parameter” and set the circuit 
parameters in the following order: a, 8, a9, n. Click on “OK” then “Numerics” in the 
AUTO window. Set “Ntst: 150”, “Nmax: 2000”, “Par Max: 50”, and then click on “OK”. 
Click on “Axes”, “hI-lo”, and set “Y-axis: p1”, “Main parm: alpha’, “Xmin: 0”, “Ymin: 0”, 
“Xmax: 30”, “Ymax: 20”. Click on “OK”, then “Run” in the AUTO window, and “Steady 
state”. The graph should look like in Fig. 6. 12a. 


8 Actually, P1 concentration only while we keep p2 — and the other four species in the system — equal 
toOatr=0. 
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Fig. 6.10 Repressilator equilibrium states by varying p; (0) for a = 10. (a) The circuit has a single 
unstable steady state (indicated by the arrow in the (p;, p2) plane) where the three repressors have 
the same concentration. (b) Increasing p (0) up to 2, the Repressilator shows sustained oscillations 
for each of the three repressors 


Click on “Grab” and move the cross on point 2. Its type (HB)? indicates that it is a 
Hopf bifurcation point. Press “Enter”, click on “Run”, and then “Periodic”. Now the graph 
shows the maximum and minimum value that p; takes during the oscillations at the limit 
cycle for each different value of a (see Fig.6.12b). This ends our analysis of the Hopf 
bifurcation in the Repressilator circuit. 


Take Home Message 

¢ Bistable circuits, such as the Toggle Switch, are usually characterized by the hysteresis 
phenomenon. Hysteresis implies a theoretically discontinuous jump between two stable 
steady states. Graphically, a hysteresis curve can present a saddle-node bifurcation 
points. 


°It is written on the bottom of the AUTO window under “Ty”. 
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Fig. 6.11 Repressilator steady Stable steady state 
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¢ The toggle switch is bistable only if at least one of the two repressors shows cooperative 
behavior. 

¢ The Repressilator achieves sustained oscillations only after a precise tuning of kinetic 
parameter values. In particular, cooperativity among the molecules of each repressor 
strongly favors oscillations. 

¢ A Hopf bifurcation point signals the transition from a regime of no oscillations to a 
regime of sustained oscillation. 
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What You Will Learn in This Chapter 

This Chapter aims at showing how to simulate the dynamics and analyse the steady states 
of the Toggle Switch with COPASI. Initially, we will build a simple model (without 
inducers) based on mass action kinetics in order to run both deterministic and stochastic 
simulations. The comparison of their results will show that the values of some kinetic 
parameters provoke a clear difference in the predictions of the same model if a stochastic 
algorithm is run instead of solving the ODE system associated with the model. This 
simplified model, together with the same equations as in the paper by Gardner et al. [21] 
— which we already met in Chap. 6 — will be also used to explain how to run stability 
analysis with COPASI. With respect to XPPAUT, COPASI offers less features for this task. 
However, COPASI can both calculate and determine the nature of the steady state that a 
synthetic gene circuit will reach, given the circuit initial conditions. Finally, we will write 
a more detailed model, which contains the inducer-repressor interactions, in order to show 
how to mimic in silico the measurements that are carried out in the wet-lab on living cells. 


7.1 Stochastic Simulations and Stability Analysis with COPASI 


Our first model is based on mass action kinetics. We need six species at least: the two 
repressors, Rj and R2, and the two promoters both in the active (p{, p>) and inactive ( a 
p>) state. Promoters are active in their wild-type configuration and become inactive only 
when bound by the corresponding repressor. The circuit reactions are 


AY . 
Ri Pl yi 
; 
pi —> Rit pi 
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Be 3 71) 


For our first simulations, we will use the following values for the rate constants in 
Eq. (7.1): Ay = Az = 10°M~!s7!, wy = wz = 0.181, ky = kp = 0.587! kg) = kan = 
0.00116 s~!. As initial species concentration, we can set PL = Pps = 10-° M, whereas all 
the other species concentrations are set to 0. As usual, the system volume is 1.66 10~!? ml. 
With this choice of parameter values and initial concentrations, the circuit is symmetric in 
the two transcription units. 

The model in Eq. (7.1) can be written into the COPASI graphical user interface by 
following the instructions given in Chap. 3. After inserting the eight reactions together 
with their rate constants, define a plot to visualize the dynamics of the six species in the 
circuit. First, run deterministic simulations with the LSODA method over a time interval 
of 10,000s. Not surprisingly, the two repressors reach the same concentration at the 
equilibrium (see Fig. 7.1a). In order to get information about the nature of this steady 
state, open the branch “Task” then “Steady-State”. The options “calculate Jacobian” and 
“perform Stability Analysis” should be checked. Click on “Run” and then “Stability”: 
the Jacobian! has four real, negative eigenvalues that imply that the steady state is 
(asymptotically) stable. This result is due to the fact that the model in Eq. (7.1) does not 
take into account cooperativity. Hence, only a single stable steady state is possible. 

Let us run stochastic simulations on the Toggle Switch. Under “Time Course” choose 
either “Stochastic (Direct method)” or “Stochastic (t-Leap method)”. Run stochastic 
simulations several times. Although there is no longer a perfect symmetry between Rj 
and R2, the overall circuit dynamics does not change drastically with respect to the 
deterministic results (Fig. 7.1b). Set, now, A; = Az = 10° M~!s~! and run again a few 
stochastic simulations. The increased repressor-promoter affinity can switch off for a long 
period of time one of the two promoters. This causes an evident disparity between the 
amount of the two repressors at different simulation time steps. Hence, with high values 


1More precisely, the reduced Jacobian that takes into account only the independent species of the 
system. For instance, the total concentration of p, ( pi) is a conserved quantity. Hence, one can 
express p{' as a function of pi (pi = pi - pi) or vice versa. The same holds for p2. Therefore, the 
system in Eq. (7.1) has only four independent species. 
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Fig. 7.1 Comparison between deterministic and stochastic simulations of the Toggle Switch. (a—b) 
Deterministic and stochastic simulations with medium affinity between repressors and promoters. 
No considerable difference is present between the results of the two methods. (c—d) Deterministic 
and stochastic simulations with high affinity between repressors and promoters. Stochastic and 
deterministic simulations give very different results 


of A; and A2, stochastic simulations give, in general, very different predictions from those 
of deterministic simulations (Fig. 7.1c—d). Moreover, the system equilibrium is strongly 
affected by random fluctuations in protein expression. 


7.1.1 Dimensionless Model 


Let us recall the Toggle Switch dimensionless model proposed by Gardner et al. [21] 


du _ 1 

dt i+” 

dv 1 

ii a (7.2) 
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Symbols are explained in the Chap.6.* Write the model in COPASI. Since it is 
dimensionless, the volume can be set to 1. 

We will run simulations and stability analysis by using the same values as in Chap. 6. 
Let us set a1 = a2 = 10 (the two promoters have the same strength), 6 = y = 1 (i.e. no 
cooperativity among the repressor molecules) and, as initial conditions, vu = v = 0. Runa 
stability analysis: as expected, the steady state is stable with u = v.* 

Let us keep a; = 10 but change a2 into 1. Moreover, let us set 8 = y = 2. The steady 
state calculated by COPAST is again stable and corresponds to an excess of Ry over Ro in 
the system. 

What happens if we keep 8 = y = 2 and set aj = a2 = 10? These parameters 
guarantee bistability, therefore the steady state chosen by the circuit depends on the initial 
values of u and v. If we set both equal to zero, the Toggle Switch will end up in an unstable 
steady state where u = v = 2 Change, now, u to 2: the system falls into a stable steady 
state where u ~ 9.90 and v ~ 0.10.4 


7.2 The Toggle Switch with the Two Inducers 


A more complete model, which takes into account the inducer-repressor binding/unbinding 
reactions, is written as 


ko H(n2,K H2,R5) 
— 


ky A(m,Ky1,R%) 
—> 


2 
. a 5) i 
it Ri — Ry 
Ri mae i+ Re 
1 ial + 1 
: 59 ; 
in + RS —> Ri 
- €2 : 
Rs — ig + RS 
kai 
RY “$s 
kaa, 
RS “4 
kai . 
Ri Ss al 
i kani . 
Ry SS 12, (7.3) 


2In this description, u and v represent R; and R> rescaled to their corresponding Hill constants. 


3Notice that, as it is the case of the model in Eq. (7.2), stochastic simulations can fail with kinetics 
different from mass action. 


4A symmetrical steady state is reached by setting v = 2 and uw = 0 as initial conditions. 
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Table 7.1 Kinetic parameter 


: Parameter | Value Units 
values for the Toggle Switch k 10-7 |e! 
model in Eq. (7.3) 1,2 0,5 10 

51.2 10° M7!s—1 
€1,2 0.1 si 


kaa.2i | 0.00116 | s~! 
kaa.2)a | 0.00116 | s~! 
Km 10~! M 
n\2 2 


where Ri, RS ; ae and Ri, are the active and inactive configurations of the two repressors, 
whereas i; and i2 represent the two inducers. The synthesis of the active repressors is here 
described by Hill functions. The kinetic parameter values for the reactions in Eq. (7.3) are 
listed in Table 7.1. Whenever an inducer is sent to the circuit, its initial concentration is 
set to 10~° M. Inducers are supposed not to decay but to be washed out of the circuit once 
the steady state is reached (i.e. i; and i2 are never present in a comparable amount at the 
same time). 

Let the system reach a steady state in the absence of chemicals (check “update model” 
before starting the simulation). After 10,000 s the two active repressors are present in the 
same amount, ~142 molecules. Change i; concentration to 10->M and let a simulation 
run for 10,000 more seconds. In the new steady state, RY concentration is close to 
zero, whereas R5 is highly expressed (about 431 molecules). By recalling the Toggle 
Switch scheme in Fig. 6.1, this equilibrium corresponds to a low fluorescence level. Set i; 
concentration to zero and let the system evolve for other 10,000 s: there is no big change, 
the circuit is locked into the low-fluorescence state. Set i; = 0 again (the low amount of i; 
at this steady state is due to Ri decay) and iz = 10~> M. Run another 10,000 s simulation. 
Here you see the switch from low to high fluorescence (R{ is now present in around 428 
molecules, almost 2000-fold more than RS ). Finally, remove both chemicals and run a new 
simulation (10,000 s again): the circuit remains locked into the high-fluorescence state (see 
Fig. 7.2). 


Take Home Message 

e The value of some rate constants can cause a big difference in the model simulations 
when a stochastic algorithm is used instead of a deterministic framework. 

¢ Given the circuit initial conditions, COPASI can calculate the circuit steady state and 
determine the nature of it. 

¢ COPASI allows to run circuit simulations that reflect the wet-lab procedures for 
measuring the output of synthetic gene circuits. 
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What You Will Learn in This Chapter 

As we have seen in Chap. 2, making a model for a synthetic gene circuit demands to define 
the species and the reactions necessary to predict the dynamics of the circuit output. The 
circuits we dealt with so far were structurally simple since they contained a handful of 
transcription units and their models could be derived by hands. However, genetic circuits 
made of a high number of DNA components — such as the RNAi-based logic evaluator 
[43] described in this Chapter — require computational methods in order to determine, 
in an automatic way, all reactions and species to be considered in their models. Rule- 
based modeling is a powerful computational technique that permits to derive model species 
and reactions from a minimal, abstract description of the kind of molecules involved in a 
biological system and the way they interact. In this Chapter we will give an introduction 
to the main ideas of rule-base modeling, and we will show how to derive fully mechanistic 
models for both the Repressilator and the Toggle Switch by using the open-source software 
BioNetGen [11]. 


8.1 The RNAi-Based Logic Evaluator, a Complex Synthetic Gene 
Network 


A logic evaluator is a kind of circuits that makes decisions. When engineered into cells, a 
logic evaluator senses molecules (input signals), elaborates the information they carry, and 
triggers a proper response (a bio-chemical process). 

The RNAi-based logic evaluator [43] was implemented in mammalian cells. Potentially, 
it could be used to detect sick cells (diagnostic) and trigger a pathway to cure them. 

RNA interference (RNAi) is a translation regulation mechanism typical of higher 
eukaryotes. Long, double-stranded RNA molecules are spliced into small interfering 
RNAs (siRNAs) by the Dicer enzyme. In the cytoplasm, siRNAs bind other molecules 
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to form the RISC (RNA-induced silencing complex). One siRNA strand is then degraded, 
whereas the other bind, by base pairing, a specific mRNA molecule. Upon siRNA binding, 
mRNA is cleaved by the Argonaute — an endonuclease present in the RISC — and degraded 
quickly. 

The RNAi-based logic evaluator senses, as inputs, several endogenous signals and 
makes a decision according to the result of a Boolean (logic) operation on these inputs. 
Between the input signals and the circuit output (a fluorescent protein) there are biological 
mediators i.e. the siRNA molecules. A logic multiplication (AND) is realized with the 
scheme in Fig.8.1. Here, each signal-when present — prevents the synthesis of the 
corresponding siRNA; siRNA binding sites are placed on the 3’ UTR (untranslated region) 
of the target mRNA that encodes for a fluorescent protein (FP). When either siRNA binds 
the mRNA, the nucleotidic chain is rapidly degraded and translation aborted. Therefore, 
siRNA-A and siRNA-B have to be absent in order to have fluorescence in the cell, which 
requires the presence of both A and B input signals. 

A logic NOT operation based on a single siRNA is illustrated in Fig. 8.2. Input signal 
C activates the transcription of siRNA-C. Thus, only in the absence of C (i.e. NOT(C)) 
fluorescence is expressed. 

If the two mRNAs in Figs. 8.1 and 8.2 encode for the same fluorescent protein, they can 
be composed into a circuit performing an OR (sum)! logic operation (see Fig. 8.3) since 
fluorescence appears when at least one of the two Boolean conditions is verified. 


1(4 AND B) OR (NOT (C)). 
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Following this principle, Rinaudo and co-authors engineered a complex circuit able 
to sense four inputs. These molecules control the synthesis of five siRNAs. They target 
two mRNAs that produce the same yellow fluorescent protein, as illustrated in Fig. 8.4. 
Specifically, this circuit corresponds to the Boolean formula: (A A C A D) v (A A B), 
where A stands for AND, Vv for OR, and A means NOT(A). 

How complex is a model for this circuit? Every siRNA requires two genes: one gene 
constitutively expresses an activator or a repressor to which an input signal binds, the other 
encodes for the siRNA molecule and is regulated by the transcription factor bound by the 
corresponding input signal (see Fig. 8.5). Moreover, there are two genes encoding for the 
yellow fluorescent protein. Hence, the circuit needs 12 transcription units overall. 
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The complexity of regulated promoters depends mainly on the number of operators (the 
transcription factor binding sites) they host. A single operator can lie into two states (free 
or taken), two operators give rise to 4 states, three operators generate 8 states. In general, 
n operators imply the existence of 2” different states* and each state corresponds to a new 
species in the circuit model. Further complexity arises when cooperative behavior among 
transcription factors is taken into account. RNA interference demands to consider several 
interactions such as the splicing of long double-stranded RNA molecules by the Dicer 
enzyme and the formation of the RISC in the cytoplasm. Thus, cell compartmentalization 


2This is a so-called combinatorial explosion problem since the number of promoter states grows 
exponentially in n. 
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should be present in the circuit scheme. Furthermore, in eukaryotic cells mRNA molecules 
undergo the action of the spliceosome in order to become mature. 

Both mRNA | and 2 in the scheme depicted in Fig. 8.4 lie into several different states. 
Their number depends both on how many siRNAs are targeting them and the number of 
binding sites per each siRNA. As a simplification, however, one can suppose that two or 
more siRNAs cannot bind a single mRNA strand at the same time since just one siRNA 
is sufficient to induce mRNA fast degradation. This assumption assures no combinatorial 
explosion in the states of regulated mRNAs. 

By taking into account all these eukaryotic cell processes and assigning two operators to 
each promoter and two binding sites to each siRNA, we proposed for the RNAi-based logic 
evaluator in Fig. 8.4 a model — completely based on mass action kinetics — that gathers 
197 species and 474 reactions [36]. This model was obtained via a rule-based modeling 
approach. 


8.2 Agents, Attributes, Rules, and Patterns 


Rule-based modeling is a very efficient method to build models for circuits that are made of 
a large number of species and reactions. This technique demands only an abstract, concise 
description of the types of molecules present in the circuit and their interactions. All actual 
species and reactions are then derived automatically. Software such as BioNetGen [11] and 
Kappa [12]° provide rule-based modeling implementations. 

In a rule-based modeling approach, molecules are agents (i.e. independent objects with 
a well-defined structure) and molecular interactions are described by rules that specify 
changes in the attributes of the agents. Attributes are the states where functional sites of an 
agent can lie. For instance, we can represent a repressor R as an object having three sites: 
a DNA binding site (b), a dimerization site (d), and an activation site (a) where chemicals 
bind. Site b can be in two states: “bound” to the DNA or “unbound”. Site a can lie into 
two states as well: “taken” by a chemical or “free”. Site d does not require to be associated 
with any state. Chemicals (C) can be treated, for the sake of simplicity, as structureless 
molecules i.e. they do not have any sites (see Fig. 8.6). This does not prevent them from 
interacting with repressors, as we will show below. In BNGL (BioNetGen Language) R 
and C are molecule types and are denoted as R(b ~ bound ~ unbound,d,a ~ free ~ 
taken) and C(), respectively. 


3The syntax of the languages used by these two pieces of software is very similar. We made use of 
BioNetGen in order to generate the model for the RNAi-based logic evaluator. 
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A pattern permits to select a particular attribute of a molecule. For instance, the pattern 
R(b ~ unbound) matches all repressors R not bound to any promoter, no matter if they 
are in a monomeric or dimeric configuration and if they are activated or not by a chemical. 
A rule represents a bio-chemical interaction by describing, with patterns, the changes in 
the sites of the interacting molecules. For instance, dimer formation and dissociation (with 
rate constants 6 and €, respectively) are written in BNGL as 


R(b ~ unbound, d) + R(b ~ unbound, d)<R(b ~ unbound, d!1).R(b 
~ unbound, d!1) 6,€ 


where we require only that the repressors are detached from the DNA, whereas it does 
not matter if they are bound or not to a chemical. A new species, the repressor dimer, 
is indicated by joining the two monomers with a dot (.’). Moreover, the bond between 
the two reactants is represented as a change in the dimerization site d. Indeed, d is now 
followed by an exclamation mark (“!’’) and a number (the bond ID). 

As another example, the binding/unbinding interactions (with rate constants 1 and 2) 


between a chemical and a repressor monomer can be written as 
CO+ R(a~ free)R(a~ taken) dh, 


By processing rules, BioNetGen derives all species and interactions present in a 
biological system. As an input, BioNetGen requires a file where molecule types, rules, 
and seed species are declared. The latter are the molecules present at the beginning of 
a simulation (for instance C() and R(b ~ unbound,d,a ~ free). They have to be 
accompanied by their concentrations. 

On the whole, rule-based modeling permits to represent a biological system of high 
complexity in an intuitive way. This requires to sketch the kind of molecules involved 
in the system together with the possible states of their functional sites, the molecules 
present at the beginning of the system simulation, and rules that underline how molecules 
interact to each other. The actual species and reactions are then computed by software, 
developed ad hoc, that can also carry out deterministic and stochastic simulations of the 
system dynamics. 
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8.3 ARule-Based Model for the Repressilator 


The model for the Repressilator in Eq. (6.6) describes the dynamics of each transcription 
unit i, i : 1,...,3 via two ordinary differential equations, one for the mRNA .4@, the 
other for the repressor protein ;. Repression of the synthesis of mRNA .4@ by repressor 
P 3 is represented by a Hill function. How to model a Repressilator transcription unit, 
according to mass action kinetics, if we suppose that each promoter hosts two operators 
where the corresponding repressor binds? 

The ith promoter should be described by the state of its two operators, O1; and O2;. 
Each operator can be either free (f) from or taken (t) by repressor molecules. Hence, a 
single promoter can lie in four different configurations: both operators are free (O if oO), 
only one of the two operators is taken (or 05 , and 0; F of ), and both operators are taken 
( 0} F O8;): In principle, only the configuration of of allows the production of mRNA 
(m;). However, we can consider that the other three promoter states, where at least one 
operator is bound to repressor R;, contribute to the promoter leakage. Beside mRNA 
decay, we shall consider the degradation of both free and DNA-bound repressor molecules. 

Overall, the reactions involved in a Repressilator transcription unit are 


km 
O},03, > Of, 03, + mi 


Z kik f 
0}, 05; =. 0}; 94; +m; 
-& 
ot.of “ ot. of +m; 
ei 
01,05; —> 01,05; + mi 


kr 
m; —> m+ R; 


fpr @2 f 
Rj + 01,03, = 04; 03; 


iar ee Ar 
Rj +0j,03, = 07,03; 


(a, B) 
Rj + OL0;, = 01,03; 


fot Ob 
Rj + 0}, 05; ~ 0}; 05; 


kam 
— 


kar 
—_ 


4Remember that (i, j) =, 2), (2, 3), (G, 1). 
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01,05, 4 of,0%, , (8.1) 


where k,, is the transcription rate constant, kj), the transcription-leakage rate constant, kp 
the translation rate constant, @ represents the repressor-operator binding rate constant, 6 
the repressor-operator unbinding rate constant, kg», is the mRNA decay rate constant, and 
kar the repressor degradation rate constant. For the sake of simplicity, in Eq. (8.1) we have 
assumed that repressor R; has the same affinity towards each of the two operators and R; 
decay rate is always the same, no matter if the repressor is bound to the DNA or free in the 
bacterial cytoplasm. 

From Eq. (8.1) it is apparent that a complete model for the core Repressilator circuit (i.e. 
without taking into account the transcription unit that leads to fluorescence expression) 
requires 18 species and 57 reactions. We will calculate them by using BioNetGen). 
BioNetGen takes, as an input, a script file in the so-called BioNetGen language (extension: 
“bngl’) and returns, as an output, a file (with extension “net’”) that summarizes the species 
and the interactions present in the system under study [16]. BioNetGen can also write 
a model description in the Systems Biology Markup Language (SBML) [26] that is 
supported by many computational tools for Systems and Synthetic Biology. We will use 
the SBML file generated by BioNetGen in order to simulate the repressilator dynamics 
with COPASI. 


8.3.1 The Repressilator in the BioNetGen Language 


A BNGL file is structured into sections delimited by the tags begin and end. Comments 
are preceded by the symbol #. At the beginning of the “repressilator.bng]” file we write the 
definition of the compartment (the whole bacterial cell) that hosts the circuit 


begin compartments 
ecoli 3 1.66e-15 # litres 
end compartments 


With the above instructions, we name the compartment as “ecoli” and specify both its 
dimensionality (“3”) and volume (in litres, as reported by the comment at the end of the 
second line). Compartment definition is not mandatory but useful when the output is in the 
SBML format. 


SBioNetGen is freely available at http://www.bionetgen.org. Here we use BioNetGen-2.3.1. 
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In the second section of our BNGL file, we list all parameters present in our model: 


initial concentrations, rate constants, and the compartment volume again 


begin parameters 


pl te. 
p2_ic 1. 
p3_ic 1. 
R1l_ic 1. 
km 0.5 


Oe-09 
Oe-09 
Oe-09 
Oe-08 


klk 5.0e-05 


kR 0.02 
alpha 1. 
beta 9 


Oe9 


kdm 0.0058 
kdR 0.00116 


volume 1.66e-15 


end parameters 


We refer to the three promoters as pl, p2, and p3. Their operators will be treated as 


functional sites. The initial concentration of each promoter (pN_ic, where N = 1,..., 3) 
is set to 10~° M i.e. promoters are present in a single copy inside the cell. Units are not 
specified in BNGL.° 


The next section of the BNGL file includes a list of the molecule types i.e. the agents in 


our model 


begin molecule types 
pl(O01~free~taken,02~free~taken) 
O1~free~taken, 02~free~taken) 


Pp 


p3 (Ol~free~taken,02~free~taken) 


2 ( 
( 
m1 () 
m2 () 
m3 () 
R1 () 
R2 () 
R3 () 

Trash () 


end molecule types 


Only the three promoters are characterized by functional sites, their two operators. In 


the above code we had to specify all possible states in which an operator can lie (free 
or taken). The other agents in the model are all structureless. “Trash()” is a fake species 


required by BioNetGen for representing degradation reactions, as we will see below. 


®In our model, km, kiz, KR; B, Kdm, and kgR are expressed in s—!, whereas a in M~! s—!. Numerical 
values are taken from [13, 33]. 
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After defining the molecules types, we can add the seed species to the BNGL file. In the 
Repressilator, at the beginning of a simulation we need four species: the three promoters 
(where both operators are “free”) and one of the three repressors,’ to break the circuit 
symmetry and trigger oscillations 


begin seed species 
pl(O1l~free,O02~free)@ecoli pl_ic 
p2(O1l~free,O02~free)@ecoli p2_ic 
p3 (Ol~free,O02~free)@ecoli p3_ic 
R1()@ecoli R1_ic 
end seed species 


The notation “@ecoli” is used to indicate that the four seed species are located into the 
“ecoli” compartment. 

The last section of a BNGL file is reserved to the reaction rules. They are an abstract 
representation of the interactions listed in Eq. (8.1). 


begin reaction rules 
pl(O1l~free,O2~free) -> pl(Ol~free,O2~free) + m1() 
km«xvolume 
p2(O1l~free,O2~free) -> p2(O01l~free,O2~free) + m2() 
km«volume 
p3(O1l~free,O2~free) -> p3(O1l~free,O2~free) + m3() 
km«volume 


pl(Ol~taken) -> p1(Ol~taken) + m1() klk«xvolume 

pl(O02~taken) -> p1(O2~taken) + m1() klk«xvolume 

p2(O1l~taken) -> p2(O1l~taken) + m2() klk«xvolume 

p2(02~taken) -> p2(O02~taken) + m2() klk«volume 

p3(O1l~taken) -> p3(Ol~taken) + m3() klk«volume 

p3(O2~taken) -> p3(O02~taken) + m3() klk*xvolume 

m1() -> m1() + R1() kR*volume 

m2() -> m2() + R2() kR*volume 

m3() -> m3() + R3() kR*volume 

R1() + p2(Ol~free) <-> p2(Ol~taken) alphasxvolume, 
betaxvolume 

R1() + p2(02~free) <-> p2(O02~taken) alphaxvolume, 
betaxvolume 

R2() + p3(Ol~free) <-> p3(Ol~taken) alphasxvolume, 
betaxvolume 

R2() + p3(02~free) <-> p3(O02~taken) alphaxvolume, 
betaxvolume 


TR 1, Whose initial concentration—-specified in the “parameters” section—corresponds to 10 molecules. 
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R3() + pl(Ol~free) <-> pl(Ol~taken) alphasxvolume, 


betaxvolume 
R3() + pl(O2~free) <-> pl(O2~taken) alphaxvolume, 

betaxvolume 

1() -> Trash() kdm«volume 

2() -> Trash() kdms*volume 

3() -> Trash() kdmsvolume 

1() -> Trash() kdR«x«volume 

2() -> Trash() kdR*volume 

3() -> Trash() kdR*volume 
porate => oe kdR*volume 

1(02~taken) -> p1l(O02~free) kdR*volume 
Sa era -> p2(O1~free) kdR*«volume 
aa eee -> p2(02~free) kdR«xvolume 

3(Ol~taken) -> p3(Ol~free) kdR«xvolume 
ene -> p3(02~free) kdR*xvolume 


end reaction a 


DNA transcription is the only reaction that demands to specify the state of both 
operators. Repressor binding to and unbinding from an operator is not influenced by 
the state of the other operator. Similarly, the degradation of operator-bound repressors is 
independent of the state of the operator nearby. Each rule is associated with a rate constant. 
Here, we multiply the rate constants by the compartment volume to let COPASI calculate 
the reaction rates properly. 

The very last instructions included into a BNGL file are called actions but do not 
demand the begin and end tags. 


# actions 
generate network (overwrite=>1) 
writeSBML () 


The action “generate_network” processes the content of a BNGL file and writes all 
species and reactions (together with the various parameters) into a NET file. The option 
“overwrite”, if set to 1, allows replacing an existent “repressilator.net” file. writeSBML(), 
finally, makes BioNetGen write the output file “repressilator.xml” in the SBML format.® 

Figure 8.7 shows the dynamics of the Repressilator according to the model generated 
by BioNetGen. Interestingly, sustained oscillations arise both from deterministic and 
stochastic simulations, even though cooperativity was not taken into account explicitly 
in any rule or kinetic parameter specified in the BNGL file. 


8BioNetGen is run from a console. Place the “repressilator.bngl” file into the folder that contains 
BioNetGen and then type “./BNG2.pl repressilator.bngl”. 
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Fig. 8.7 Deterministic (a) and stochastic (b) simulations of the Repressilator with COPASI 


8.4 


Protein-Protein and Chemical-Protein Interactions: A Model for 
the Toggle Switch 


As described in Chap. 6, the Toggle Switch is made of two transcription units that repress 
each other. Moreover, each repressor can be inhibited by an input signal (i.e. a chemical or 
a thermal pulse). Equation (7.3) presented a simplified model for the Toggle Switch where 
translation was treated as a single-step event and Hill functions were used to represent 
translation down-regulation. With BioNetGen we can derived a more detailed mechanistic 
model, fully based on mass action kinetics, that lies on the following assumptions 


each repressor binds its target promoter only as a dimer, 
upon chemical binding, a repressor monomer is no longer able to dimerize, 
the degradation of dimers (bound and unbound to the DNA) and monomers takes place 


at the same rate. 


each promoter contains two operators and each operator can be into two different states: 
free and taken by a repressor, 
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In this framework, a repressor becomes a structured agent as in Fig. 8.6. Chemicals, 
in contrast, can be treated as stuctureless species. According to the notation in Chap. 6, 
we set that the promoter p;, which is regulated by repressor R,, leads the production of 
mRNA mz that is translated into the repressor R2. By binding p2, Ro suppress m, and, as 
a consequence, R, synthesis. Finally, the inducer chemical i; binds and inactivates Rj; i2 
interacts with R2 in the same way. 

By following, wherever possible, the same notation as in the Repressilator case above, 
the parameters for the Toggle Switch become in BNGL 


begin parameters 
pl_ic 1.0e-09 
p2_ic 1.0e-09 
i_ic 0 
kml 0.05 
km2 0.03 
klk 5.0e-05 
kR 0.02 
alpha 1.0e9 
beta 9 
delta 5.0e5 
epsilon 2.0e-05 
lambda 1.0e6 
mu 0.01 
gamma 1.0e6 
kdm 0.0058 
kdR 0.00116 
kdi 3.0e-05 
volume 1.66e-15 
end parameters 


We require that the two promoters are present in a single copy within a bacterial cell.” 
The only difference between them is in the transcription initiation rate with k,,; slightly 
higher than k,,2. Hence, in the absence of chemicals in the system, R2 is expected to 
prevail on R;. 6 and € are the dimer formation and dissociation rate constants; A and jz are 
the chemical-repressor binding and unbinding rate constants. Finally, kg; is the chemical 
decay rate constant and ic_c the chemical initial concentration (set to zero). Numerical 
values are taken from [33]. 

As for the molecule type, we need to include the new species i; and i2 representing 
the chemicals that bind R; and R2, respectively. Moreover, as mentioned above, we shall 
specify three functional sites in the repressor structure: “dna” represents the DNA binding 


°The “toggle_switch.bngl” file starts with a compartment definition identical to the one written for 
the Repressilator. 
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domain and lies into two possible states: bound and unbound; “‘s” is the chemical binding 
domain. This site can be either “active” (i.e. the chemical is not bound to the repressor) or 
“inactive” (the chemical has made a complex with the repressor). The name of the states 
of this site reflects the capability of the repressor to bind or not the operators of the target 
promoter; “dim’’, finally, is the dimerization domain and serves to establish a bond between 
two monomers. Overall, the molecule type definition is written as 


begin molecule types 
pl(01~free~taken,02~free~taken) 
p2 (O1~free~taken,02~free~taken) 
m1 () 
m2 () 
R1 (dna~bound~unbound, s~active~inactive, dim) 
R2 (dna~bound~unbound, s~active~inactive, dim) 
i1() 
i2() 
Trash () 
end molecule types 


Among the seed species, we need to include the two chemicals together with the 
promoters in their free-of-repressor configuration, as show below 


begin seed species 

pl(O1~free,O02~free)@ecoli pl_ic 

p2(Ol~free,O02~free)@ecoli p2_ic 

i1()@ecoli i_ic 

i2()@ecoli i_ic 
end seed species 

The presence of both chemical is here necessary since no reactions that accounts for 
chemical production is present in our model. Finally, the “reaction rules” can be divided 
in several blocks, as in the following 
begin reaction rules 

# transcription and translation 

pl(O1l~free,O2~free) -> pl(O1l~free,O2~free) + m2() 


km1*volume 
p2(O1l~free,O2~free) -> p2(O1l~free,O2~free) + m1() 
km2*volume 
pl(Ol~taken) -> p1(Ol~taken) + m2() klk«xvolume 
pl(02~taken) -> p1l(O2~taken) + m2() klk«xvolume 
p2(O1l~taken) -> p2(Ol~taken) + m1() klk«xvolume 
p2(02~taken) -> p2(O02~taken) + m1() klk*xvolume 
m1() -> m1l() + R1(dna~unbound,s~active,dim) kR«xvolume 
m2() -> m2() + R2(dna~unbound,s~active,dim) kR«xvolume 
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# chemicals inactivating monomers 


i1() + R1(dna~unbound, s~active,dim) <-> R1(dna~unbound, 


s~winactive,dim) lambdaxvolume,mus*volume 


12() + R2(dna~unbound, s~active,dim) <-> R2(dna~unbound, 


s~winactive,dim) lambdaxvolume, muxvolume 

# repressor dimerization 

R1 (dna~unbound, svactive,dim) + R1(dna~unbound, 
s~active,dim) <-> R1(dna~unbound,s~active,dim!1). 
R1 (dna~unbound, s~wactive,dim!1) deltaxvolume, 
epsilon«volume 

R2 (dna~unbound, svactive,dim) + R2(dna~unbound, 
s~active,dim) <-> R2(dna~unbound,s~active,dim!1). 
R2 (dna~unbound, s~wactive,dim!1) deltasxvolume, 
epsilon«volume 

# dimers binding promoter operators 


R1 (dna~unbound, s~vactive,dim!1) .R1(dna~unbound, s~active, 


dim!1) + pl(Ol~free) <-> p1(O1~taken) 
alphasxvolume,betaxvolume 

R1 (dna~unbound, svactive,dim!1) .R1(dna~unbound, 
s~active,dim!1) + p1(02~free) <-> p1(02~taken) 
alphasxvolume, beta*xvolume 

R2 (dna~unbound, svactive,dim!1) .R2(dna~unbound, 
s~active,dim!1) + p2(O1l~free) <-> p2(01l~taken) 
alphasxvolume, beta*xvolume 

R2 (dna~unbound, svactive,dim!1) .R2(dna~unbound, 
s~active,dim!1) + p2(02~free) <-> p2(02~taken) 
alphasxvolume, beta*xvolume 

# degradations 


m1() -> Trash() kdms*«volume 
m2() -> Trash() kdms*«volume 
R 


(dna~unbound, s~active,dim) -> Trash() kdR«volume 
(dna~unbound, s~vactive,dim) -> Trash() kdR«volume 
R1(dna~unbound,s~inactive,dim) -> i1() kdR«volume 


R2(dna~unbound,s~inactive,dim) -> i2() kdR*«volume 
4] 
i1() -> Trash() kdixvolume 


( 
( 
1() -> Trash() kdixvolume 
( 
( 


R1 (dna~unbound, svactive,dim!1) .R1(dna~unbound, 


s~active,dim!1) -> Trash() kdR*volume 
R2 (dna~unbound, svactive,dim!1) .R2(dna~unbound, 
s~active,dim!1) -> Trash() kdR*volume 


pl(Ol~taken) -> p1l(O1l~free) kdR*volume 
pl(02~taken) -> pl1(02~free) kdR*volume 
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Fig. 8.8 Toggle switch simulations with COPASI. Protein numbers are calculated at different steady 
states. Two consecutive steady states are separated by 5 104s 


p2(Ol~taken) -> p2(O1l~free) kdR*volume 
p2(02~taken) -> p2(02~free) kdR*xvolume 
end reaction rules 


For the sake of simplicity, we neglected the possibility of chemicals binding to dimers. 
The Toggle Switch model, generated by BioNetGen, is made of 18 species and 54 
reactions. We can run simulations with COPASI in a way similar to that in Chap. 7. First, 
we let the circuit reach a steady state in the absence of chemicals (model time: 5 10*s). 
R> is produced in moderately high quantity (about 60 molecules), anyway enough to shut 
off R, synthesis. At this point, we induce the system with 10> M of i2 and let the circuit 
evolve for 5 104 more seconds. The circuit switches to a steady state where R| is expressed 
(around 36 molecules) and R2 is absent. We set, now, the concentration of i; to 10->M 
and i2 to 0. As a result, the circuit settles back to a steady state where Ro is rather high 
(almost 63 molecules) and R practically absent. If we set, this time, i2 to 10-> M and 
i, to 0, we re-obtain the previous steady-state where the number of molecules of R; was 
roughly 36. By removing both chemicals from the system, the circuit goes back to the 
initial steady state characterized by the presence of approximatively 60 molecules of Ro. 
This was expected since k2 > km (see Fig. 8.8). 


Take Home Message 

¢ The rule-based modeling approach is extremely useful to generate the list of species 
and reactions of complex synthetic and natural biological networks. 

¢ Software implementing rule-based modeling need only a concise system description in 
order to compute a detailed mechanistic model. 
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¢ In synthetic gene circuits, rule-based modeling is effective to tackle the combinatorial 
explosion of species and interaction number. This problem arises, for instance, when 
multiple operators are present along promoter sequences. 


Check for 
updates 


What You Will Learn in This Chapter 

The models for the Repressilator and the Toggle Switch that we have built in Chap. 8 are 
not based on the notion of DNA parts and the two circuits are treated just as biological 
systems made of interacting molecules. However, as we have seen in Chap. 1, an ultimate 
goal of Synthetic Biology is to developed methods for the modular design and modeling of 
genetic circuits, where basic components — DNA sequences and pools of signal carriers — 
are linked together in a visual way, as it is usually done with electronic circuits. One of the 
first attempts in this direction is represented by the Parts & Pools software, the Synthetic 
Biology add-on of ProMoT, a computational tool for the modular design and analysis 
of complex systems. In the following, we will see how the Repressilator and the Toggle 
Switch can be designed in a “drag and drop” way with Parts & Pools. First, DNA parts 
and pools are displayed on the canvas provided by ProMoT. Then, they are connected to 
each other with wires were the signal carrier molecules are imagined to flow. Each circuit 
component is associated with a pre-existent mathematical model, in which a user can only 
change parameter values. A model for the whole circuit is generated by ProMoT via the 
composition of the models of the circuit components. The circuit model can be exported, 
finally, into the SBML format and used for simulations and analysis with COPASI. 


9.1 Parts & Pools 


The first version of Parts & Pools [33,34], which is fully integrated into the open-source 
software ProMoT [39],! was developed for the visual modular design of synthetic gene 
circuits in bacterial cells. Parts & Pools is a collection of Perl scripts that generate ODE- 


'ProMoT is freely available at http://www2.mpi-magdeburg.mpg.de/projects/promot/. 
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based mathematical models for bacterial Standard Biological Parts (promoters, ribosome 
binding sites, protein-coding sequences, sequence encoding for small RNAs, DNA spac- 
ers, and terminators), pools of signal carriers (RNA polymerases, ribosomes, transcription 
factors, small RNAs, and chemicals), and a particular pool that hosts enzymatic reactions. 
Each Perl scripts takes, as an input, a text file where the user can specify both some features 
(e.g. if a promoter is constitutive or regulated) and the kinetic parameter values of the 
corresponding part or pool. The output is an MDL (model definition language — the internal 
language of ProMoT) file that contains a mathematical model for the chosen part/pool and 
terminals (i.e. interfaces) that permit the visual composition (wiring) of parts and pools 
into synthetic gene circuits. 

As we have seen in Chap. 1, parts are defined as DNA sequences with a well-defined 
function in transcription or translation, whereas pools are the abstract places where 
free molecules of signal carriers are stored. Parts are composed into higher modules, 
the transcription units, that interact by exchanging molecules of signal carriers such 
as transcription factors and small RNAs. Hence, in a circuit scheme, pools of signal 
carriers are the graphical interfaces among transcription units. Exceptions are the pools 
of chemicals (or environmental signals) that put in communication the whole circuit with 
the surrounding environment. Furthermore, since the content of any pool plays a non- 
negligible role in determining the circuit dynamics, pools can be considered as biological 
batteries. 

Synthetic gene circuit design demands, first, to generate all necessary parts and pools. 
They are, then, displayed, in a “drag and drop” way, on the ProMoT visual editor. Finally, 
matching terminals of different parts and pools are wired together in order to close the 
circuit. A mathematical model for the whole circuit arises from the composition of the 
models of its components and is encoded, by default, into an MDL file. The circuit model 
can be exported into the SBML format that, as we have seen in Chap. 8, can be used to run 
analysis and simulations with COPASI. 


9.2 Modular Modeling 


Parts and pools are modeled independently, according to mass action kinetics. They 
exchange fluxes of signal carriers. These fluxes are biological currents, shared input/output 
that allow parts and pools composition into genetic circuits. Parts that host interactions 
with signal carriers — such as the promoter that binds RNA polymerase — have also access 
to the value of the concentration of free signal carrier molecules into their corresponding 
pools. These concentrations are like biological potentials because they influence circuit 
performance. All these concepts are clarified below with the description of the model for 
a bacterial transcription unit that expresses a generic transcription factor. 
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Fig. 9.1 Symbols. The 


original icons adopted at the 
MIT Registry to represent «pe. a me =| a 


Standard Biological Parts are 


employed in the first version of Promoter RBS CDS Terminator 
Parts & Pools. A transcription 

unit, as a biological module of 

higher complexity, is 

associated with the symbol of a 2 


“protein generator” ee ee 


Transcription unit (protein generator) 


9.2.1 Bacterial Transcription Unit 


A bacterial transcription unit encoding for a protein is made of four parts: promoter, 
RBS, CDS, and terminator (see Fig.9.1). RNA polymerases go through all parts on a 
DNA sequence, ribosomes scan the parts transcribed into mRNA. RNA polymerases 
recognize and bind a promoter at the so-called —35... — 10 region. In the Parts & Pools 
modeling framework — as shown in Fig. 9.2 — free RNA polymerase molecules, pol/’®, 
are supposed to interact with free promoter sequences, P, according to a Michaelis-Menten 
scheme. They form a complex, [Ppol], from which RNA polymerases enter the promoter 
clearance phase and start DNA transcription. A promoter generates two fluxes of RNA 
polymerases: a balance flux (Po P'S”) exchanged with the RNA polymerase pool and an 
output flux (Po PS°’) towards the RBS. The former is due to RNA polymerase binding to 
(rate constant k,) and unbinding from (rate constant k_;) the promoter. The latter equals 
k[P pol], where ky is the promoter transcription initiation rate.” 

The PoPS°’ flux generated by the promoter flows into a new complex, [Pol B], 
inside the RBS. From [PolB] RNA polymerases initiate the elongation phase that is 
accompanied by the synthesis of mRNA, here indicated as b. As soon as the DNA is 
transcribed, free ribosome molecules, rJt ee have access to the Shine-Dalgarno sequence 
and form a complex, [rb], with the nascent mRNA chain. In analogy with the promoter- 
RNA polymerase system, ribosomes and mRNA interact by following a Michaelis-Menten 
scheme as well.? The RBS is characterized by three fluxes: a balance flux of ribosomes 
(RiP S° ) exchanged with the ribosome pool (kj, and k_j, in Fig.9.2 are the ribosome- 
mRNA association and dissociation rate constants) and two output fluxes (PoP S° and 


Despite the fact that RNA polymerase is an enzyme, in the Michaelis-Menten scheme that describes 
the interaction between RNA polymerase and promoter, the promoter plays the role of the enzyme 
E and the RNA polymerase that of the substrate S — see Eq. (2.46). 


3 Also in this case the ribosomes are the substrate for the Michaelis-Menten scheme and the mRNA 
corresponds to the reaction enzyme. 
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Fig. 9.2. Modeling with Parts & Pools: fluxes and concentrations of free molecules. Promoter, RBS, 
CDS, and terminator are assembled into a transcription unit encoding for a transcription factor 
protein. The promoter and the terminator are connected to the RNA polymerase pool, whereas the 
RBS and the CDS are linked to the ribosome pool. The CDS is wired to a transcription factor pool as 
well. Promoters have access to the concentration of free RNA polymerase molecules (into the RNA 
polymerase pool). Promoter-RNA polymerase interactions determine the formation of a balance flux, 
PoPS®.Inan analogous way, the RBS “knows” the amount of free ribosomes into the corresponding 
pool. RBS-ribosomes binding/unbinding reactions are associated with a Ri PS? flux. Further fluxes 
of RNA polymerases (here indicated as Po P S°") connect all DNA parts along the transcription unit 
and the terminator with the RNA polymerase pool. A ribosome flux (Ri PS°“’) goes from the RBS 
to the CDS, the two parts transcribed into mRNA, and from the CDS to the ribosome pool. Moreover, 
a FaPS°“' flux connects the CDS to the corresponding transcription factor pool. A module for the 
mRNA is absent but mRNA dynamics is computed correctly via the time derivative of the species b 
inside the RBS 


Ri PS‘) towards the CDS. The former is equal to k.j[PolB], where k.; is the RNA 
polymerase elongation rate, the latter corresponds to k2;[rb], where kz, represents the 
translation initiation rate. Both RNA polymerases and ribosomes flow into new complexes 
belonging to the CDS. They are named [Pol A] and [ra], respectively (“‘A” stands for the 
adenine in the START codon). RNA polymerases leave the [PolA] complex and flow 
into the terminator (PoP S°“’ = k,;[PolA]). Ribosomes move from the START up to the 
STOP codon, where they form a new complex with the mRNA chain ([ru], where u is 
the uracyl at the beginning of the STOP codon) before getting free and flowing back to 
their pool (RiP S°” = ¢,[rb], where ¢, is the translation termination rate). If the CDS 
encodes for a transcription factor, this protein is released when the ribosomes detach from 
the mRNA. A FaPS° flux towards a transcription factor pool is then generated and 
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equals in value the Ri PS°“’ flux. RNA polymerases form a complex, named [PolT], into 
the terminator and then leave the double chain. Hence, the terminator sends to the RNA 
polymerase pool a PoP S°* flux that amounts to ¢[PolT], where ¢ is the transcription 
termination rate. 

It should be noted that, into each pool, the time derivative of the concentration of free 


‘ . : : dpolfree 2 
signal carriers is calculated as a sum of fluxes. For instance, “ 7— contains as many 


PoPS’ + PoPS'"* terms as there are transcription units into a synthetic gene circuit. 

In the MDL files, terminals handle fluxes of signal carriers and concentrations of 
free signal carriers. These quantities are exchanged between circuit components, whose 
terminals are wired together, and represent a common currency that makes parts and pools 
composable. 


9.3. Designing the Repressilator with Parts & Pools 


In order to design the core Repressilator circuit we need 12 parts organized into three 
transcription units. Moreover, we need 5 pools: one for each of the three repressors 
expressed by the circuit transcription units together with the RNA polymerase and the 
ribosome pool. If we assume that the three transcription units are identical (as we did both 
in Chaps. 6 and 8) we need to generate only 4 parts (one for each category) and a single 
transcription factor pool.? As we will see below, parts and pools can be easily re-used 
within a circuit scheme. 


9.3.1 The Promoter 


First, let us create a directory “repressilator’” where to place all files for the Repressilator 


components. After running ProMoT?® click on “File” then on “Run Script...”, a new win- 
dow (named “Run Script’) appears. Choose as “Generation Directory” the “repressilator” 
one.’ As for “Script File”, after clicking on “...” a new window labelled as “Open” pops 


up. Select the ProMoT installation directory among the ones into the “Bookmarks” menu 
on the right. Then, in the left panel, double-click on the “scripts” directory and then on 
the “synthetic-biology” one. Here, select the script “promoter_gen.pl” and then click on 
“Open”. As for “Input File”, repeat the same steps as above for “Script File” and select the 


4 PoPS'" corresponds to a PoP S°" from a terminator. 
5 Wherever possible, we will use the same parameter values as in Chap. 8. 


Here we use ProMoT version 0.8.5 on Linux (Ubuntu) operating system. To launch Pro- 
MoT graphical user interface, you should run the script “promot-ui” located in the directory 
“fasr/local/promot/scripts/” if you have made a default ProMoT installation. 

TBy clicking on “...” on the left of each entry in this window you can browse the directories in your 
computer. 
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file “‘promoter.inp” before clicking on “Open”. Back to the “Run Script” window, click on 
“Edit Input File”. The “Text Editor for promoter.inp” window shows up on the screen. Here 
you can set the structure of the promoter and the numerical values of the rate constants 
associated with the biochemical reactions that take place inside the promoter part.® 

Every part and pool is modeled according to mass action kinetics. A promoter can 
be either constitutive or regulated. The latter case implies the presence of (maximum 
2) operators along the promoter sequence.? O, is the operator closest to the TSS 
(transcription start site) and is supposed to have the strongest affinity towards repressors. 
O2, in contrast, shows the strongest affinity towards activators. In case of cooperativity 
between two repressors (no matter if they are of the same kind or not), O; binding 
provokes an increase in the affinity of O2 towards its corresponding repressor. In contrast, 
if cooperativity is between two activators, O2 occupation enhances the affinity of O; 
towards its activator. Cooperativity is mimicked by a change in the transcription factor- 
operator binding and unbinding rate constants. A promoter can also be both activated 
and repressed but, in this case, the repressor is allowed to bind O; only and the activator 
O>2. Activators always show either cooperativity or synergistic activation (see Chap. 12). 
The latter provokes a change in the transcription initiation rate constant and in both RNA 
polymerase-promoter binding and unbinding rate constants in order to reproduce a strong 
enhancement in gene expression. 

In analogy with the Repressilator model in Chap. 8, we demand that each promoter 
contains two operators bound by a unique repressor species. Moreover, we neglect any 
cooperativity effects. In the file “promoter.inp”, we set the “promoter name” to “p_r’, 
for instance.!® “Operator number” is, in our case, 2, “transcription factor number” 1. 
“Ol transcription factor” and “O2 transcription factor” should be set both to “Ral”. 
The notation for transcription factors and corresponding chemicals is based on [31] and 
follows strict rules. Transcription factors can be named as: “Ra” (active repressor), “Ri” 
(inactive repressor), “Aa” (active activator), and “Ai” (inactive activator). Their name shall 
be followed by the number of the operator they bind.!'! Signals (i.e. the chemical binding 
the transcription factors) can be either J (inducers) or C (corepressors). This letter shall be 
followed by the same number associated with the transcription factor they interact with. 
Inducers bind either active repressors or inactive activators (i.e. they promote transcription 


Sif you prefer, you can copy all Perl scripts, input files, and icon files (PNG) from 
“/usr/local/promot/kb/scripts/synthetic-biology/” to the “repressilator” directory and run the perl 
scripts from a terminal. 

°A more recent version of Parts & Pools — developed mainly for eukaryotic gene circuit design [36] 
— does not put any limit on the operator number. 

10Rach string or number you write into a Parts & Pools input file shall not be included between 
quotation marks nor apices. Moreover, it must be separated by a blank space from the semicolon on 
its left. 

1 y¢, like in this case, the same transcription factor binds both operators, its name should be followed 
by the number 1. 
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initiation); corepressors interact with inactive repressors or active activators (hence, they 
prevent transcription). Notice that, after binding chemicals, transcription factors change 
their configuration from active to inactive or vice versa. Even though we do not make 
use of chemicals in our Repressilator circuit, out of consistency we shall set both “Ol 
signal” and “O2 signal” to “Il”. Since we decided to ignore any cooperative effects, 
“cooperativity” is set to “n” (no); “synergistic activation” is “n” as well because we are 
dealing with repressors. “Readthrough” and “promoter flag” shall be “n’” too. The former 
can be set to “y” (yes) only when a promoter is preceded by a terminator belonging 
to another transcription unit,!* the latter is “y” only in the particular case of a double 
promoter at the beginning of a transcription unit. “P_free” is the promoter concentration 
(in Molars). For a single copy, we shall set it to “1d-9”.!° As in BNGL files (see Chap. 8), 
units shall not be written into any input files for Parts & Pools. They are, however, defined 
in the corresponding Parts & Pools scripts and written into MDL files. “alphal” and 
“betal” are the transcription factor-operator binding and unbinding rate constants for a 
promoter that contains a single operator. Since we have set the operator number to 2, these 
two parameters will be ignored by the “promoter_gen.pl” script. “alphalf” and “betalf” 
represent, in our system, Ral-O; binding and unbinding rate constants when Op? is free. 
“alphalt” and “betalt” are the Ral-O, binding and unbinding rate constants when QO? is 
taken; “alpha2f” and “beta2f correspond to Ral-Oz binding and unbinding rate constants 
when (QO, is free. Finally, “alpha2t’” and “beta2t” stand for Ral-O2 binding and unbinding 
rate constants when Qj is taken. In the absence of cooperativity, we can set the four 
“alphas” to “1d9” and the four “betas” to “9”. 

Since, according to Parts & Pools assumptions, O, has a stronger affinity than O2 
towards Ral, “cooperativity” set to “y”’ would demand to have “alpha2f” lower than 
“alphalf”, “alphalt’,'* and “alpha2t”. Analogously, “beta2f’ would be higher than 
“betalf’, “betalt’, and “beta2t”.> 

“Lambdal” and “mul” are the I1-Ral binding and unbinding rate constants. Since 
there are no chemicals in the Repressilator circuit, we can set both rate constants to 
0. “Lambda2” and “mu2” will be ignored because, in our promoter, we have a single 
transcription factor. 

“k_d1” and “k_d2” are the decay rate constants of the transcription factors bound to 
O; and Oz, respectively. In our case, they both refer to Ral and are set to the same value: 
“0.00116”. 


!2Readthrough means that RNA polymerase molecules do not leave the DNA after meeting 
a transcription termination signal but flow into an adjacent promoter without ceasing mRNA 
production. 


13MDL used d instead of e to represent powers of 10. 
l4ccalphalf” and “alphalt” would have the same value. 


151f rate constant values are not consistent with cooperative effects, the “promoter_gen.pl” script 
returns a warning message. 
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“nl” is the number of molecules of I1 that bind a single molecule of Ral (complete 
cooperativity — see Chap. 2). We can set it to “0” since this interaction does not take place 
in our promoter. “n2” will be ignored. 

“gammal” and “gamma2” are the rate constants of the binding of a chemical to a 
transcription factor bound to O; and Og, respectively. In our promoter, in principle, they 
refer to I1 binding to Ral when Ral lies either on O; or on O2. However, since we do not 
have any chemicals in our circuit, both “gammas” can be set to zero. 

“k1” is the RNA polymerase-promoter binding rate constant, “k_1” is the RNA 
polymerase-promoter unbinding rate constant, whereas “k2” is the transcription initiation 
rate. We can set them to to “1d5”, “0.01”, and “0.5”, respectively. “klsa’”, “k_lsa”, and 
“k2sa” are the rate constants of the same reactions under synergistic activation, which 
is absent from our system.!° Finally, “k2_lk” is the leakage transcription initiation rate 
constant. As we did in Chap. 8, we set it to “Sd-5”. Each promoter configuration where at 
least one operator is taken gives a contribution to transcription leakage. 

After filling in all entries of the “promoter.inp” file, click on “Model” and then “Save”. 
Close the “Text Editor” window. Click on “Generate model”: the file “p_r.mdl” is created 
into the “repressilator” directory. 

The model of our promoter includes the signal I1. Hence, the “p_r.mdl” file contains a 
terminal that shall be connected to a signal pool. Since we do not have chemicals in the 
Repressilator circuit, we need to generate a “fake” signal pool where I1 concentration is 
kept equal to zero. To this aim, run the script “sips_off_gen.pl”. The file “sips_off.mdl” is 
then written. 


9.3.2 The RBS 


The RBS is a small DNA trait. It encodes for the Shine-Dalgarno sequence that lies 
about 10 nucleotides upstream of the START codon. Due to its complementarity to a 
ribosome region, the Shine-Dalgarno sequence (on the mRNA) is recognized and bound 
by ribosomes. Here, translation initiation takes place. 

By following the same procedure illustrated above, open the “rbs.inp” file. The first 
parameter, “RBS type”, shall be set to “n’”, which stands for normal. Normal RBS means 
that the RBS belongs to a monocistronic DNA sequence i.e. a single transcription unit. 
Hence, the RBS is preceded by a promoter and followed by a CDS. Bacteria show also 
polycistronic sequences where a unique promoter leads the expression of more than one 
gene. In this case, the RBS may be preceded either by a DNA spacer (the “RBS type” is 
“s”) or by a CDS (the “RBS type” is “n” but the parameter “pseudo-cistronic” shall be 


16Tp principle, it holds that kj,g > kj, kK—1sq < k_1, and kosq > ky. When these conditions are not 
respected only a warning message is returned by the “promoter_gen.p!” script. 
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set to “y”.!’) In this version of Parts & Pools the “RBS type” can be also set to “c” in 
order to mimic a particular kind of translation regulation based on the interaction between 
cis-repressing and trans-activating RNAs [18, 27, 38]. 

Let us set the “RBS name” as “rbs_r’. “polymerase readthrough” shall be set to “n” (as 
we did for “p_r” promoter to which “rbs_r’” will be connected); “ribosome readthrough” is 
“n” since this phenomenon can take place only along polycistronic sequences. “repressed 
by sRNA’” is also “n” because, in the Repressilator circuit, there is no interaction between 
small RNA molecules and the mRNA that leads to repression of translation.!® Both 
“pseudo-cistronic” and “first pseudo-cistron” are to be set to “n’” as well because we are 
not dealing with polycistronic sequences. 

“k_el” is the RNA polymerase elongation rate within the RBS, let us set it to “2”.!° 
“k_d” represents the mRNA decay rate. As in Chap. 8, we set it to “0.0058”. “kir’ and 
“k_ Ir” are the rlbosome-mRNA binding and unbinding rate constants. Following [33], we 
set them to “1.d6” and “0.01”, respectively. “k_2r” is the translation initiation rate. We 
set it to “0.02”, as in Chap. 8. The other parameters present in the file “rbs.inp” refer to 
interactions between the mRNA and different types of sRNA molecules. Therefore, we 
can ignore them. For an explanation of their meaning we refer the reader to [33, 37].7° 
After running the script “rbs_gen.pl’, the file “rbs_r.mdl” is created. 


9.3.3 The CDS 


This part can also be referred to as the “gene” since it encodes for proteins. Open 
the file “coding.inp”. The first entry, “Coding type” accepts five different characters: 
“p” (proteins without a specified function), “r’ (reporter proteins), “t” (transcription 
factors), “e” (enzymes), and “‘c” (proteins encoded into a polycistronic sequence). For the 
Repressilator circuit we have to choose “t’. In this way, the MDL file corresponding to each 
CDS contains a terminal for the connection to a transcription factor pool. We can ignore the 
next two entries (important for polycistronic sequences only) and set the “Coding name” 
to “rps_r’. “k_el” is the RNA polymerase elongation rate along this part. If, for the sake 
of simplicity, we assume that the three repressors in the circuit have the same length (640 
nucleotides), we can set “k_el” to “0.0625”. “k_el_r’ is the ribosome elongation rate along 


'7Here we call pseudo-polycistronic a multicistronic sequence where DNA spacers have been 
omitted. 

18The Parts & Pools modeling framework supposes that every post-transcriptional regulation takes 
place inside the RBS. This assumption does not provoke any loss of information even though, in 
living cells, small RNA molecules can be complementary to portions of the CDS or the 3’UTR. 
19This value comes from the assumption that the mean RBS length is of 20 nucleotides and the 
average RNA polymerase elongation speed in EF. coli equals 40 nt/s [28]. 

207 ater versions of Parts & Pools [35,36] use a different model for the RBS where interactions with 
riboswitches are also taken into account. 
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the CDS. By taking 35 nt/s as the average ribosome elongation speed in E. coli [28], we 
can set “k_el_r’ to “0.044”. “k_d’’, the protein degradation rate constant, shall be equal to 
“0.00116” since we used this value inside the “p_r” promoter above. Finally, “zeta_r” is the 
ribosome-mRNA dissociation rate constant at the STOP codon. We can keep the default 
value: “0.5” [19]. Run the script “coding_gen.pl” and the file “rps_r.md1” is created. 


9.3.4 The Terminator 


Terminators are placed at the end of a transcription unit. In bacteria, terminators are 
characterized by the presence of a palindromic sequence rich in guanines and cytosines 
that is followed by at least six thymines. Upon transcription, the G-C-rich motif folds into 
a hairpin that slows down RNA polymerase elongation. Then, a short U-A double stranded 
sequence is formed into the enzyme active site. The weak multiple U-A bonds let RNA 
polymerase finally escape from the DNA. 

Open the file “terminator.inp” and set the terminator name to “t_r’. The next four entries 
— which are related either to readthrough or some particular cases — can be set all to “n’’. 
“zeta” is the RNA polymerase-DNA dissociation rate constant, we can maintain the default 
value: “31.25” [4]. “eta” is the readthrough rate constant, we can ignore this parameter. 
Run the script “terminator_gen.pl’”. The file “t_r.mdl” is written. 


9.3.5 Pools 


As mentioned above, for the Repressilator we need three kinds of pools: the RNA 
polymerase pool, which is connected to every promoter and terminator, the ribosome pool 
that, in contrast, is wired to every RBS and CDS, and the transcription factor pools (three, 
as there are three repressors in the circuit) that link paired transcription units. Each kind 
of pools has its own perl script and input file. 


9.3.5.1 The RNA Polymerase Pool 

Open the “polpool.inp” file. “n_in” and “n_b” shall match the number of terminators and 
promoters in the circuit, respectively. Hence, both have to be set to “3”. “pol_free” is the 
average concentration of RNA polymerases free from any interaction with the DNA. We 
can keep the default value: “2.1d-6” [33]. Run the script “polpool_gen.pl” and the file 
“Polpool_3_3.mdl” is generated. 


9.3.5.2 The Ribosome Pool 

In the “ribpool.inp” file the entries “n_in” and “n_b” shall be set to the number of CDSes 
and RBSes in the circuit, respectively. Hence, both of them are equal to “3”, in our case. 
“rib_ free” is the mean concentration of ribosomes that are not bound to the mRNA. We 
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can stick to the default value: “4.2d-6” [33]. Run the script “ribpool_gen.pl’”: the file 
“Ribpool_3_3.mdl” is created. 


9.3.5.3 The Transcription Factor Pool 

Open the file “tfpool.inp”. Set “tf name” to “rep”. The “type” entry can be either “m” 
(monomer) or “d” (dimer). Let us set it to “m” since in the Repressilator model in Chap. 8 
we did not take into account any dimerization process. “n_in” and “n_b” are, respectively, 
the number of CDS encoding for and promoter regulated by a transcription factor. Both 
parameters shall be set to “1” since each repressor protein involved in the Repressilator is 
encoded in a single gene and binds a unique promoter. “k_d” is, as usual, the transcription 
factor decay rate constant. It shall be set to “0.00116” (the same value we used in the 
promoter and the CDS input files). “delta” and “epsilon” are the dimer formation and 
dissociation rate constant, respectively. Since we decided to treat our three repressors as 
monomers, we can ignore these two parameters. “tf_free” is the initial concentration of 
transcription factor in the pool. We can set it to “0”.7! Run the “tfpool_gen.pl’”. The file 
“rep_pool_1_1.mdl” is written. 


9.4 Circuit Design 


Once all necessary parts and pools have been generated, we can proceed with the visual 
design of the Repressilator circuit. The first step is to place promoter, RBS, CDS, and 
terminator into a module representing a transcription unit. Then, the whole circuit is drawn 
by wiring in silico each transcription unit to four pools: RNA polymerases, ribosomes, 
the repressor encoded by the CDS of the transcription unit, and the repressor that down- 
regulates the promoter of the transcription unit. 


9.4.1 Transcription Unit Design 


In the “ProMoT Browser’, click on “File, New, Dynamical Model...”. A window named 
“New Dynamical Model” pops us. As a “Name” for the “New Model” write “t_unit’, 
then click on “Create New Model”. The window “Visual Editor for t_unit” appears on the 
screen. We do not need to use it for the moment and you can close it. In the “ProMoT 
Browser”, “t_unit” has been placed under “modeling-entity, structural-modeling-entity, 
module, dynamic-model”. In order to assemble a transcription unit in silico, you shall 
load into ProMoT the MDL files corresponding to the four parts. Click on “File, Open, 
Model...” (or, more quickly, press “Ctrl” and “O” together on the keyboard). In the 


“Open” window, “Look in” should be set to the “repressilator” directory. Select “p_r.mdl” 


211m COPASL, we will change the initial concentration of one repressor to 10-8 M, as in Chap. 8. 
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Fig. 9.3. ProMoT visual editor. The icons corresponding to the four parts, which make a bacterial 
transcription unit, and the fake pool of environmental signals have been placed on the canvas. They 
are enclosed in red borders since all their terminals are open 


and click on “Open”. An item named “p_r” appears as a “module” in the “ProMoT 
Browser”. Load into ProMoT “sips_off.mdl’”, “rbs_r.mdl’, “rps_r.mdl’, and “‘t_r.mdl” as 
well. They all become modules. Double-click on “t_unit” (this is a “dynamic-model”). The 
“Visual Editor for t_unit” opens again. Click on “p_r’ and drag it to the panel “Modules” 
of the visual editor. There, a promoter icon with the name “p_r” appears. Drag the other 
three parts and the fake signal pool to the same panel. On the visual editor, click on “Edit, 
Set Module Size. ..”. Set “Module Width” to 900 and “Module Height” to 500. Drag and 
drop the four parts and the fake signal pool from the “Modules” panel to the canvas (the 
right part) of the visual editor (see Fig. 9.3). 

The four parts shall now be connected by wiring together corresponding terminals. 
Click on the terminal “out_pol” of the promoter “p_r” and draw a line up to the “in_pol” 
terminal of “rbs_r’. Release the mouse button and a link between the two terminals 
appears. In the same way, connect “out_cod” on “rbs_r” to “in_rbs” on “rps_r’, and 
“out_pol” on “rps_r’ to “in_pol” on “t_r’. Finally, link the “exc_sig_1” terminal on “p_r’” 
to “exc_1” on “sips_off”. The borders of the four parts are still red since each of them has, 
at least, an open terminal that will be connected to a pool later on (see Fig. 9.4). 

These terminals shall, first, become the terminals of “t_unit”. Click on “p_r” with the 
right button of the mouse and select “Connect terminals, Propagate “exc_tf_1””. A new 
terminal, also called “exc_tf_1”, appears on the upper side of the canvas. It is connected to 
the original terminal of “p_r’” (see Fig. 9.5). Repeat the same procedure with the other five 
open terminals. Notice that, once created, you can change the position of a terminal along 
the border of the transcription unit. In the end, the four parts are no longer enclosed in a 
red frame and a new module, corresponding to a transcription unit, has been designed (see 
Fig. 9.6). 
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Fig. 9.4 Connecting part terminals. Matching terminals belonging to different parts have been wired 
to each other. The terminals that remained open will become terminals of the transcription unit 
module 
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Fig. 9.5 A terminal for the transcription unit module. The promoter terminal “exc_tf_1” has been 
propagated to the transcription unit border. “exc_tf_1” will be then connected to a terminal of a 
repressor pool 


Click on “Model, Save”: “t_unit’”, as a “dynamic-model”, is now associated with four 
connected DNA parts. In the “ProMoT Browser” click with the right button of the mouse 
on “t_unit” then select “Change Property, Icon’. In the “Open” window (where “Look in” 
should be set to the “repressilator’” directory) select “protein.png” and click on “Open”. In 
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Fig. 9.6 Transcription unit module. Each part terminal has been either linked to another terminal or 
propagated to the “t_unit” border 


this way, the protein generator symbol is associated with the newly created transcription 
unit. Select, again, “t_unit” in the “ProMoT Browser” and click on “File, Save Selected 
Classes. ..”. In the “Save” window that pops up, “Save in” should be set to “repressilator’, 
whereas you shall write “t_unit” as “File Name”. Click on “Save”. The file “t_unit.mdl” 
has been created. 


9.4.2 The Repressilator Circuit 


After building the module that represents each of the three Repressilator transcription 
units, we shall load into ProMoT the MDL files encoding for the pools required by the 
Repressilator: “rep_pool_1_1.mdl”, “Polpool_3_3.mdl”, and “Ribpool_3_3.mdl”. Now, 
we can create a new “dynamic-model” for the whole circuit. As previously seen, click 
on “File, New, Dynamical Model...” and assign the “repressilator” name to the new 
dynamic model. The visual editor appears on the screen. Drag “t_unit” and the three 
pools on the “Modules” panel of the visual editor. Click “Edit, Set Module Size...”, and 
assign 900 to both the canvas width and height. Drag and drop for three times the “t_unit” 
module on the right panel of the visual editor. Once placed on the canvas they have been 
automatically named as “t_unit’, “t_unit0”, and “t_unit1”. Drag and drop on the canvas a 
“rep_pool_1_1” module. It contains two terminals (“in_1_tf” and “exc_1_tf”) that, how- 
ever, lie on top of each other. Double click on the “rep_pool_1_1” icon. A new visual editor 
(“Visual Editor for rep_pool_1_1”’) pops up: here you can rearrange the terminals in such a 
way that they are clearly visible. Click on “Model, Save’, then close the “Visual Editor for 
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Fig. 9.7. Modular circuit design. Three transcription units, which make the Repressilator core 
circuit, together with as many repressor pools and the pools of RNA polymerases and ribosomes 
have been displayed on the ProMoT visual editor 


rep_pool_1_1”. Place on the canvas two more “rep_pool_1_1”. The three repressor pools 
are named as: “rep_pool_1_1”, “rep_pool_!_10”, and “rep_pool_1_11”. Finally, put on 
the circuit “board” the RNA polymerase and ribosome pool. Also in this case, you have to 


rearrange their terminals (6 on each pool icon) in a convenient way. Transcription units and 
pools are signed by red borders since all terminals are open (see Fig. 9.7). Now, we shall 
link the terminals of the eight modules on the ProMoT canvas. “out_tf” on “t_unit” shall 
be connected to “in_1_tf” on “rep_pool_1_1” (i.e. in this wire the repressor molecules 
synthesized by “t_unit” flow into their pool). “exc_l_tf” on “rep_pool_1_1” shall be 
connected to “exc_tf_1” on “t_unitO” (from their pools, the repressor produced by “t_unit” 
move into “t_unitO” and bind the promoter therein). Similarly, “out_tf’ on “t_unitO0” 
is connected to “in_1 tf” on “rep_pool_!_10” and “exc_1_tf” on “rep_pool_1_10” is 
linked to “exc_tf_1” on “t_unit1”. Finally, “out_tf” on “t_unit1l” is wired to “in_1_tf” on 
“rep_pool_1_11” and “exc_1_tf” on “rep_pool_1_11” is joined to “exc_tf_1” on “t_unit” 
to close the ring configuration of the three transcription units. 

Each transcription unit has still to be connected with both the RNA polymerase and 
ribosome pool. As for “t_unit’, “exc_pol” and “out_pol” can be linked to “exc_1” and 
“in_1” on the RNA polymerase pool. In an analogous way, “exc_r” and “out_r” are bridged 
to “exc_1” and “in_1” on the ribosome pool. Then, we can wire the four open terminals 
of “t_unitO” to the terminals labelled with the number 2 on both RNA polymerase and 
ribosome pool, and the four open terminals of “t_unit!” to the remaining open terminals, 
which are associated with the number 3, on both pools. The circuit is now closed and 
the red borders have disappeared from each icon (see Fig. 9.8). In order to check that no 
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Fig. 9.8 The Repressilator as a closed synthetic gene circuit on the ProMoT visual editor 


mistake has been made in wiring the terminals, click on “Tools, Check For Consistency”. 
The new window “Incidence Matrix for repressilator” appears on the screen. If the value 
of “Number of Variables (columns)” coincides with that of “Number of Equations (rows)”’ 
(184 in this case), the circuit has been designed properly. Close the incidence matrix 
window and click on “Model, Save” before closing the visual editor. In the “ProMoT 
Browser’, select “repressilator”’, click “File, Save Selected Classes...”, and save the 
circuit into the file “repressilator.mdl”.77 

If we want to analyze and simulate the Repressilator with COPASI, we have to export 
the circuit model into the SBML format. After selecting “repressilator” on the “ProMoT 
Browser”, click on “File, Export, Export SBML...”. The window “Model Output of 
‘repressilator’ to SBML” pops up. Keep the “SBML File Name” as “repressilator.xml” 
and click on “Generate Model”. On the “Output Log” you should read that the final model 
has 50 equations overall. Click on “Close”. 

Since the model written in the MDL file “repressilator.md1” is fully based on ODEs, the 
file “repressilator.xml” encodes for ODEs as well. Hence, it can be used within COPASI 
only to carry out deterministic simulations. If you want to run stochastic simulations, 
you have to convert the “repressilator.xml” file into a new SMBL file that contains the 
circuit biochemical reactions. To this aim, click on “File, Run Script...” in the “ProMoT 
Browser”. As “Generation Directory” select the “repressilator” one i.e. the folder where 
the file “repressilator.xml” is placed. As “Script File”, choose “parser_lev2_linux.pl”?? 


221 is useful, for future circuit modifications, to save into a file all modules you used for designing 
the circuit. Click on “File, Save All Classes...” and name the file as “ALL_repressilator.md]”. 


23.4 version of this script for MS Windows is also available. 
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located in the same directory as the Part & Pools Perl scripts. Let the entry “Input File” 
empty, whereas write “repressilator” (without quotation marks) on the “Command Line 
Argument”. Click, finally, on “Generate Model” and the file “repressilator_lev2.xml” is 
created. 

After importing the “repressilator_lev2.xml” in COPASI, you will see that the Repres- 
silator model, generated by Parts & Pools, gathers overall 50 species” and 117 reactions. 
The cell volume is set to 1.610~!°1, and the three repressors have initial concentration 
equal to 0. Change “Rep_Pool_1!_1_Fm”’ to 10-°°M (as in Chap. 8). Both deterministic 
and stochastic simulations will show sustained oscillations (see Fig. 9.9). 


9.5 The Toggle Switch in the Parts & Pools Representation 


In order to design the core of the Toggle Switch we need — together with the RNA 
polymerase and the ribosome pools — two transcription units, two repressor pools, and 
two chemical pools. Both transcription units contain a regulated promoter. These two 
promoters are not identical but differ for the value of the transcription initiation rate 
constant, as in the model presented in Chap. 8. The other three parts, in contrast, can be 


iE you are using a __ terminal, type  “/usr/local/promot/kb/scripts/synthetic- 
biology/parser_lev2_linux.pl repressilator”’. 

25This indeed is the number of the equations calculated by ProMoT when the Repressilator model 
was exported into the SBML format. 
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modeled exactly in the same way inside the two transcription units. The two repressor and 
chemical pools do not show any difference either. Hence, overall we need to generate 5 
parts and 4 pools. 

As for the promoters, they are similar to those in the Repressilator circuit above since 
they contain two operators that are bound by a single repressor. Each repressor is active in 
its wild type configuration and gets inactivated upon binding an inducer chemical. In the 
“promoter.inp” file we can keep most of the parameter values we wrote for the promoters 
inside the Repressilator. For both promoters (whose name will be “p1” and “p2’’), we shall 
only set “lambdal” to “1d6”, “mul” to “0.01”, “n1” to “1”, “gammal” and “gamma2” to 
“1d6”. “k2” will be equal to “0.05” for “p1” and “0.03” for “p2”. 

The other three parts are identical to those in the Repressilator circuit. Let us just change 
their names to “rbs_ts”, “rps_ts’”, and “t_ts”. 

The transcription factor pools shall take into account dimerization. Open the file 
“tfpool.inp” and change “type” to “d” but keep “rep” as “tf_name’’. According to the model 
in Chap. 8, set “delta” to “Sd5” and “epsilon” to “2d-5”. 

We still need to generate a generic pool for the environmental signals. Open the file 
“signalpool.inp”. Since we suppose that the signals are not produced by other devices in 
the circuit, “input_flag” should be “n” and, as a consequence, “n_in” “O” (this number 
represents how many devices encode for the signal). “n_b” shall be equal to the number of 
promoters connected to the signal pool, “1” in our case; “signal_name” can be set to “sg”, 
for instance. Let us keep “sig_free” (i.e. the initial chemical concentration) to “0”: we will 
change it in COPASI. “k_s” is the signal production rate constant, we shall set it to “O”. 
“k_d” stands for the chemical decay rate constant. As in Chap. 8, we can set it to “3d-5”. 

Run the script “signalpool_gen.pl’, the file “sg_pool_1.mdl” is created. 

Finally, we shall instantiate the RNA polymerase and ribosome pool. In both cases, we 
have to set — in the corresponding input file — “n_in” and “n_b” to “2” (i.e. the number of 
transcription units in the Toggle Switch). 

After loading into ProMoT the MDL files of the five DNA parts just generated, you 
can design on the visual editor — by following the instructions given above — the two 
transcription units that form the core of the Toggle Switch. Let us call them “t_unit1” and 
“t_unit2”. They contain “pl” and “p2”, respectively. The only difference with respect to 
the transcription units in the Repressilator lies in the “exc_sig_1” terminal of the promoter 
that, now, has to be propagated to become a terminal of the whole transcription unit (see 
Fig. 9.10). 

The backbone of the Toggle Switch is realized by placing on the visual editor “t_unitl”, 
“t_unit2”, the repressor pools (named “rep_pool_1_1”, “rep_pool_1_10”’), the signal pools 
(“sg_pool_1” and “sg_pool_10”), the RNA polymerase pool (“polpool_2_2”), and the 
ribosome pool (“ribpool_2_2“ — see Fig. 9.11). 
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Fig. 9.10 Transcription unit for the Toggle Switch. Differently from the Repressilator, the promoter 
“exc_sig_1” terminal is also a transcription unit terminal 
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Fig. 9.11 Toggle Switch modules. Transcription units and pools that make the core of the Toggle 
Switch are placed, without wires, on the ProMoT visual editor 


As for the terminal wiring, let us connect “exc_1” on “sg _pool_1l” to “exc_sig_1” 
on “t_unitl” and “exc_1” on “sg_pool_10” to “exc_sig_1” on “t_unit2”.2° Now, we can 


26Tn this version of Parts & Pools the interactions between chemicals and transcription factors take 
place only inside promoter parts. 
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Fig. 9.12 The Toggle Switch designed with Parts & Pools on the ProMoT visual editor 


link “out_tf” on “t_unit1” to “in_1_tf’ on “rep_pool_1_1” and “out_tf” on “t_unit2” to 
“in_1_tf’ on “rep_pool_1_10”. Moreover, “exc_1_tf” on “rep_pool_1_1” shall be wired 
to “exc_tf_1” on “t_unit2”, whereas “exc_1_tf” on “rep_pool_1_10” shall be wired to 
“exc_tf_1” on “t_unit1”. Finally, the terminals labelled as “1” on both RNA polymerase 
and ribosome pool are joined to their counterparts on “t_unit1” (e.g. “exc_1” on the RNA 
polymerase pool is wired to “exc_pol” on “t_unit!”) and terminals labelled as “2” are to 
be linked to “t_unit2”. The Toggle Switch circuit is now complete (see Fig. 9.12). If you 
run the “Checking for Consistency” tool, you will get that both number of variables and 
equations equal 128. Finally, when generating the SBML file “toggle_switch.xml’, you 
will read on the “Output log” that “36 overall equations” are present in the model. 
Before running COPASI simulation, turn “toggle_switch.xml” into “toggle_switch_lev2.xml” 

“ win’). On the whole, the model 
contains 36 species and 84 reactions. The results from deterministic simulations are 
shown in Fig. 9.13. They exhibit the same trend we have seen in Chap. 8 but the amount 
of molecules at the different steady states have changed. This is probably due to the fact 


by running the script “parser_lev2_linux.pl” (or 


that the model generated with Parts & Pools takes into account more interactions where 
chemicals are involved. They clearly play a major role in determining the circuit equilibria. 


Take Home Message 

e Parts & Pools, as an add-on of ProMoT, is a computational tool for the modular, “drag 
and drop” design of synthetic gene circuits. 

¢ Models for DNA parts and pools are generated independently of each other by running 
Perl scripts. A user can only modify the values of kinetic or structural parameters. 
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Fig. 9.13 Toggle switch simulation results. Each protein number was measured at intervals of 
50,000 seconds as in the BioNetGen model simulations in Chap. 8 


¢ Upon wiring circuit components together, a mathematical model for the whole genetic 
network arises from the composition of the models associated with the parts and pools 
present in the circuit. 

e A synthetic gene circuit and its component are encoded, by default, into MDL files, the 
language of ProMoT. MDL files can be exported into SBML format for circuit analysis 
and simulations with other software, such as COPASI. 


Check for 
updates 


What You Will Learn in This Chapter 

Models for synthetic gene circuits necessitate the knowledge of kinetic parameter values. 
However, only a fraction of them can be measured in the lab. The others are obtained 
by fitting the model to experimental data. This task is, in general, non-trivial and 
might demand to use so-called stochastic algorithms, whose theoretical foundations are 
discussed in this chapter. Beside that, you will see how to exploit the difference in reaction 
rates in order to carry out model reduction. This procedure permits to restrict the number 
of kinetic parameters and ordinary differential equations necessary to compute the circuit 
dynamics. Circuit robustness and sensitivity analysis are explained in this chapter as 
well. Sensitivity analysis, in particular, is a technique that permit to understand if species 
concentrations are affected by changes in parameter values. With this knowledge one 
can figure out which reactions (and corresponding rate constants) play a major role in 
determining the working of a synthetic gene circuit. 


10.1 Model Fitting 


Fitting a model to experimental data is a commonly used technique to estimate the values 
of the kinetic parameters involved in the model. A typical procedure is the least squares’ 
method that tries to shorten the distance between the model output and the data by 
minimizing the so-called Sum of Squared Residuals (SSR). To explain this method, let 
us recall the empirical Hill function (phenomenological model) we met in Chap. 2 


Finax — Fini 
F a Fynin 4 max 7 ; (10.1) 
1 IC50" 
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Fig. 10.1 Fitting the model to t good fitting 
the data, graphical 
representation. Values of F are 
measured at the system steady 
state 


where F represents fluorescence,! § the input signal concentration, [C50 — the half 
maximal inhibitory concentration — is analogous to the Hill constant Ky, and n is the 
Hill cooperativity coefficient. After linearization, Eq. (10.1) becomes 


log(F) = nlog(S) — nlog(/C50) = f(S,n, 1C50), (10.2) 
where 
F, —F 
Fe (10.3) 
F- Finin 


The goal of the least squares’ method is to find those values of n and [C50 that make 
Eq. (10.2) approximate best the experimental results (see Fig. 10.1). 

Experiments return a fluorescent level F; for each input concentration S;, where 
i = 1,...,N. The square distance (or residual) between the ith experimental point 
(S;,og(¥;)) and the corresponding theoretical point (S;, f(S;,n, 1C50)) is given by 
(log(Fi) — f(Si,n, IC50))*. Hence, the sum of the square residuals — s(n, TC50) — 
between all data points and their corresponding model representation is 


N 
s(n, 1C50) = ) “[log( Fi) — f (n, Si, 1C50)P. (10.4) 


i=l 


The SSR in Eq. (10.4) is, in principle, greater than or equal to zero. It equals zero 
only when all data points lie on the model line f(S,n, 1C50) i.e. never in practice. By 
minimizing s(n, IC50) with respect to both n and [C50 we obtain those values of n and 
IC50 for which f(S,n, [C50) is the best approximation of the empirical data. 


1 Fax and Fiyjn are the measured maximal and minimal fluorescence level, respectively. 
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Fig. 10.2 A complex r 
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In other words, whenever a system obeys to Eq.(10.1), the least squares’ method 
corresponds to find the values n = n* and IC50 = IC50* such that s(n, 7C50) is 
minimal i.e. 


s(n*, 1C50*) = min [s(n, TC50)]. (10.5) 
a, 


The minimization problem in Eq. (10.5) is a so-called optimization problem. In any 
optimization problem a function s(m) is termed objective function* and 1 is the set of 
model parameters.* 

Finding the minimum of s(z) corresponds to find out a set of parameter values 2* such 
that the model function f(2x*, 0) (where o is the input vector*) is as close as possible to 
its target, the experimental data. 

In general, s(a) can have a complex shape with a global optimum (minimum) and 
several local optima as in Fig. 10.2. 

Algorithms try to find an optimum of s(z) by evaluating the objective function (and, 
sometimes, its first and second derivatives) in several points of the parameter space until 
the difference between f (2,0) and the experimental data stays lower than a pre-fixed 
small quantity €, i.e. s(a@) < € (convergence condition). 


10.1.1 Local Optimization 


Local optimization aims at finding an optimum of the objective function in the vicinity 
of a given point (;) in the parameter space. The gradient of s(a) evaluated at 2; 


25 (z) is areal, twice differentiable function. 
3In Eq. (10.5), s(n, 1CSO) is the objective function and x = (n, IC50). 


4Tn our example, we have a single input, i.e. the chemical concentration S. 
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(Vs(x)|;) gives the direction along which s(x) grows more rapidly by moving away 
from ;.° Therefore, —Vs(x)|; represents the steepest descent direction. A small step 
in this direction permits to reach a point mj; such that s(a;) is lower than s(x;). More 
formally, for a small-enough coefficient w we have that s(a; — wVs(z)|;) < s(2;). This 
is the main idea behind the gradient descent method to find a local optimum of s (sr). The 
gradient descent algorithm requires an initial point m9, a maximum number of iterations 
Nmax, and an error € to evaluate convergence. At each iteration, the algorithm moves 
from 2; to mj+1 according the formula: rj+; = 2; — aV(x)|;. The algorithm iterates 
until either convergence or Nyx is reached. The step size a is not a fixed value and can 
be determined at each algorithm iteration under the condition that s(2j;+1) < s(mj). 


10.1.2 Global Optimization 


Kinetic parameters are, generally, real numbers. If a model contains N parameters, the 
corresponding parameter space is the R” hyperplane. This continuous space can be 
converted into a grid of dimension N and granularity M after limiting to M the quantity 
of possible values that each parameter can take. In this way, the parameter space contains 
a discrete number of points equal to M%. The global minimum of an objective function 
is found, in principle, after evaluating the objective function at each point of the grid.° 
However, for large values of M and N, an exhaustive search of the global optimum is 
an intractable problem. Therefore, one makes use of a stochastic algorithm that performs 
a scan of the parameter space through randomized jumps. This method, when carried out 
properly, allows to find the objective function global minimum within a reasonable amount 
of time, after spanning only a portion of the parameter space. However, if the search is not 
done accurately, a stochastic algorithm can get stuck into a local optimum. 

In order to deal with optimization problems associated with complex biological systems 
two kinds of stochastic algorithms are often used: the Simulated Annealing and the Genetic 
Algorithms. 


10.1.2.1 Simulated Annealing (SA) 


This algorithm takes inspiration from physics, in particular from magnetic systems. They 
_ Ej) 
lie in a state 2; with probability p(2;) proportional toe *7 , where E(z;) is the system 


energy, kg the Boltzmann constant,’ and T the system temperature. In an optimization 
problem, the energy becomes the objective function (i.e. E(aj;) = s(m;)) and kp is 


5The gradient of a generic function f(x,y,z) is defined as V/f(x, y,z) = 
(2Gy2) Of (yoo) BF )-2) ) 
x ? dy 2 OZ 7 


©More precisely, the higher the granularity of the grid, the larger the probability of finding the global 
minimum of the objective function. 


Tkp = 1.381 10~733/K. 
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set to 1. T is the parameter to vary at each algorithm iteration. At the beginning of the 
simulation, the system/problem is at high energy and elevated temperature. At every step, 
the temperature is lowered by a fixed quantity 57. When T = 0, the system freezes into 
the global minimum if the cooling process was slow enough.® Following the Metropolis- 
Hastings algorithm, the simulated annealing procedure requires, at each iteration, the 
following operations 


* given the sistem in the state 2;, chose a neighbour state 2, either via gradient descent 
or through a small, random modification of 2; coordinates, 
* evaluate s(x): 


1. if s(t) < s(mj), Set Hi+1 = Hp (ie. the system moves from wz; to Hj41 = Hp), 


§(1n)—S(m;) 


2. if s(,) > s(x;), calculate the probability p = e T and pick a random 
number r € (0,1). If p > r, set wj41 = Xp Gump), otherwise set w;+1 = mw; (no 
move), 


¢ lower T by the quantity 57. 


If 5T has been chosen small enough, at JT = 0 the system is likely to be found into its 
global optimum. Otherwise, the system would probably get stuck into a local minimum of 
s(a). The random acceptance of states r, where s(z,) > s(s;) helps the system escape 
from a local optimum. This is highly probable for large values of T, whereas it gets more 
difficult as soon as T (and therefore p) approaches 0. 


10.1.2.2 Genetic Algorithms 

Genetic Algorithms are derived from biology since they try to mimic mutations and 
selection as they take place during the evolution process. Differently from SA, which 
finds a single solution, Genetic Algorithms produce a population of solutions (individuals), 
each characterized by its own genome i.e. a set of parameter values. At every generation, 
the fitness of each individual is evaluated. In our case, the fitness is represented by the 
objective function s(z). Individuals are ranked according to their fitness and only the best 
N individuals (a pre-fixed amount) are selected for the next steps of the algorithm. All the 
others individuals are eliminated from the population. Each of the selected individual is 
paired, randomly, with another one. They have offsprings (two for instance) whose genome 
arises from the combination of the parent genomes by cross-over. Mutations are here 
allowed. These offsprings form the next generation of individuals. Generally, the average 
fitness improves from generation to generation since the new individuals come from the 


8A magnetic system is disordered at high temperature due to the molecule thermal agitation. If the 
system is frozen too quickly, the molecules do not have a time sufficient to settle into the optimal 
system configuration i.e. the one that minimizes the system energy. 
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best ones in the previous generation. Once a new generation arise, the first N individuals 
in the fitness ranking are kept and the whole procedure is repeated. The algorithm stops 
when convergence, maximum number of iterations, or some minimal criteria for a good 
solution are met (this depends on the algorithm implementation). Genetic Algorithms are 
more computationally demanding than SA due to the presence of a (potentially large) set 
of solutions. 


10.1.3 Problems: Overfitting and Identifiability 


A possible, undesired consequence of fitting a model to a single data set is that the 
computed values of the kinetic parameters fail to reproduce a different data set from 
new measurements on the same system. This is the so-called overfitting problem and has, 
among its causes, the large uncertainty (errors) associated with biological data. Hence, 
parameter values should be determined after testing a model against several data sets via 
methods from bioinformatics, such as bootstrapping [29]. 

The identifiability problem is of different nature and arises when, given a model, 
some parameter values cannot be determined in an unambiguous way. Structural non- 
identifiability occurs when two parameters are present in a model only as a product or 
ratio. For instance, an algebraic relation such as Keg = ki /k_, is generally used to replace 
two rate constants that are difficult to measure (k_; and k;) with an equilibrium constant 
(Keq) that is quantified more easily in the lab. However, this prevents an exact calculation 
of both k_; and ky since the relation kj = Ke,k_, admits infinite solutions. Practical 
non-identifiability is different and it is due to a lack of data. This problem emerges when 
the number of data is lower than the number of parameters. For instance, how to determine 
the values of n and 7C50 in Eq. (10.2) if only a single data point were available? 


10.2 Model Reduction 


What can we do in order to reduce model complexity? How can we eliminate from a model 
both the kinetic parameters whose values are unknown and the ODEs that are difficult 
to solve, even numerically? As we have seen in Chap. 2, conserved quantities permit to 
simplify a model by replacing ODEs with algebraic equations. This is of great help when 
one can get rid of stiff (i.e. unstable) ODEs, which cannot be integrated easily. Beside that, 
reaction time scale is taken into account in order to carry out model simplification. In a 
transcriptional network, for instance, protein binding to and unbinding from a promoter are 
fast reactions that take just a fraction of a second; mRNA and protein synthesis, in contrast, 
needs several minutes; protein decay is even slower and can take up to some hours (this is 
the case of fluorescent proteins). Fast reactions are often accompanied by stiff ODEs that 
force the numerical solver to use very short integration time steps. However, if a reversible 
reaction proceeds at fast speed in both directions, the ratio between the concentrations of 
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products and reactants will be always very close to the reaction equilibrium constant.? Let 
us consider the scheme in Fig. 10.3a. If we suppose that 


ky, k-1 > kj, Ro, (10.6) 


i.e. 5; is produced and degraded slowly but the conversion from s1 to sz and vice versa takes 
place at much higher pace, we can approximate s7 dynamics with the algebraic relation 


2 = Keg > (10.7) 
where 
ky 82 
Keg = — = —. 10.8 
tan ies (10.8) 


Hence, the dynamics of the system in Fig. 10.3a is described by one ordinary differential 
equation and one algebraic relation 


ds = je 
dé i oS] 
62 = Kegs. (10.9) 


This is the so-called quasi-equilibrium condition that allows us to replace a (potentially) 
stiff ODE (here corresponding to the time derivative of s2) with an algebraic formula. 
Moreover, as we said above, Keg is easier to measure than k; and k_, separately. Overall, 
the model simplification is double: an algebraic relation instead of an ODE, and an 
equilibrium constant in place of two rate constants. 

Slow reactions do not modify species concentration considerably. If a species takes 
part in a slow and a much faster reaction, the latter will play a major role in determining 
the species steady state. Let us have a look at scheme in Fig. 10.3b. If s; is produced at 
low rate (ko) and transformed into sz quickly (kz >> ko), it is reasonable to think that s1 
concentration will not change dramatically over time and stay approximately constant to a 
rather low value (51) determined by 


= ds ko 


0= —=ky—k — 10.10 
it 0 — k281 => Ste E ( ) 


(ki ,k-1) 
°Let us consider the generic reversible reactionaA+bB = cC+dD. At the equilibrium, the 


rates of the forward ws = k, A“ B? ) and the reverse (v’ = k_,C° D¢) reactions are balanced such 
that the reaction equilibrium constant, Keg = kj /k—1, corresponds to the ratio C° pé / AtB?, 
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The dynamics of the system in Fig. 10.3b is given by the solution of the following 
equations 


ko 
3y=— 
1 a 
ds? 
To hast = ho. (10.11) 


Equation (10.10) is know as quasi-steady-state condition and permits to replace an 
ODE with an algebraic equation, as in Eq. (10.11) above. Here, in particular, a species 
concentration has been turned into a constant.!° 

As an example of a system where both the quasi-equilibrium and the quasi-steady- 
state condition are used to achieve model reduction [29], let us consider a gene where 
the synthesis of a protein P is controlled by a riboswitch. The wild-type riboswitch is 
off (i.e. it prevents ribosome binding) and gets activated (on) by a chemical 7. When the 
riboswitch is turned on, ribosomes have access to the mRNA and translation starts. For 
the sake of simplicity, we consider both the concentration of the inducer chemicals (/) 
and the ribosomes (RJ B) as constant.!! Let us assume that the binding of J to mF (the 
mRNA with an inactive riboswitch) takes place with rate constant A and is irreversible. 
Moreover, mRNA degradation is not explicitly considered but included into the process of 
protein synthesis (with rate constant kz). The complete schematic of our system is given in 
Fig. 10.4a. Here, kg is m°/F transcription rate constant, m” represents the complex made 
of ribosomes bound to m°” (i.e. the mRNA hosting an active riboswitch), k; and k_ are 


10We already used this condition in Chap. 2 in order to derive the Michaelis-Menten kinetics. 


11T a simulation, they should be set to a much higher value than the concentration of the other 
species in the system. 
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Fig. 10.4 Synthesis of protein P from an mRNA hosting a riboswitch. (a) A blue box denotes the 
portion of the system where the quasi-steady-state condition can be applied. (b) A red box delimits 
the species and the reactions interested by the quasi-equilibrium condition. (c) Reduced system 


the ribosome-m°” binding and the ribosome-m" dissociation rate constants, whereas kg is 
P decay rate constant. The ODEs associated with our system are 


dmtf 

— = ky — Almeff 

d on 
— = .Im¢f — ky RIBm™ + k_ym" 

d r 

—— = kyRIBm™ — (k_1 + kp)m" 
dP . 
a kom" — kaP (10.12) 


The initial concentration of m°//,m°", m", and P is zero. As for the kinetic parameters, 
let us sett A = ky = 10°M-—!s~! and kj, = 10° sl whereas ky = 0.5Ms—!, kn = 
0.1s—!, and kq = 0.00116s~! (ie. the protein P half-life is of about 10 min). With this 
choice of parameter values, we have that the inducer-mRNA binding and the ribosome- 
mRNA interactions are much faster than the other reactions in the system. Since 4 > ko, 
we can expect that m°/7 settles rapidly to a very low concentration. Therefore, by using 
the quasi-steady-state condition we have that 


molt — *o 


10.1 
vk (10.13) 
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By using the result in Eq. (10.13), the ODEs in Eq. (10.12) are rewritten as 


mot — 
ul 
d on 
— = ky —kyRIBm™" +k_\m" 
dm 
= = ky RIBm™ — (k1 + ky)m’ 
dP : 
i kom!’ —kaP. (10.14) 


The three ODEs in Eq. (10.14) can be associated with a reduced system where the 
species m°/F does not play any particular role (see Fig. 10.4b). 

Let us have a look at Fig. 10.4b. The subsystem made of m°” and m” is involved in 
four reactions. Two reactions (rate constants: k; and k_,) are much faster than the other 
interactions in the subsystem (rate constants: kg and kz). Therefore, we can apply the 
quasi-equilibrium condition to the (m°", m") subsystem. This implies that 


ky RI Bm” = k_ym" 


m = Kean 


(10.15) 
where Keg = Se The subsystem made of m’ and m°” can be described by a new 
variable m°" = m°” + m’ that lets us redraw our system as in Fig. 10.4c. 

Since we have that 
K 
m' = —“1_m”, (10.16) 
1+ Keg 
the time derivative of m°’ corresponds to 
d or d on r d on d r k K. 
ED Oe pai ee ea 
dt dt dt dt 1+ Keg 


and Eq. (10.14) can be rewritten as 


mott 
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dm® = Keg m” 
dt 1+ Keg 
P koK 
OF Rn kaP . (10.18) 
dt 1+ Keg 


On the whole, we have reduced the initial system of four ODEs in Eq. (10.12) to a 
system of two ODEs and two algebraic relations. The species m°/T is merely a constant 
and, in order to solve Eq. (10.18), it is not necessary to know ki, k_1, and RIB explicitly 
but we need the value of Keg only. Therefore, we replaced 3 parameters with a single one 
that, being an equilibrium constant, is easier to measure. 


10.3 Sensitivity Analysis 


Let us suppose to study, in the wet lab, how the output of a genetic circuit (e.g. 
fluorescence) changes with respect to modifications in the circuit components. For each 
experiment we make a single point mutation along the sequence of one DNA part (a 
promoter, an RBS, or a terminator) and then measure the circuit fluorescence level at 
steady state. We will see that only some mutations provoke a remarkable change in the 
fluorescence emitted by the circuit, whereas some others do not affect the circuit output at 
all. Given a model for our circuit, each mutation corresponds, in principle, to a variation in 
the value of one kinetic parameter. Hence, our results point out that the circuit fluorescence 
level is sensitive to variations in some parameters, whereas is robust to changes in others. 

Sensitivity analysis is a technique that permits to determine in silico which parameters 
contribute most to the circuit performance. This piece of information can be used to modify 
the circuit in vitro — for instance, as above, with point mutations — in order to get the desired 
output in vivo. There are two kinds of sensitivity analysis, namely 


* local sensitivity analysis: it estimates how species concentrations!* change — usually 
at steady state — as a consequence of small perturbations in the kinetic parameters. If 
a model contains N species and M kinetic parameters, the sensitivity coefficient cj; 1s 
defined as the first derivative of the concentration of the species s;, i = 1,..., N with 
respect to the model parameter 7;, j = 1,..., M ata point m* = (zr, ..., 7 ,) in the 
parameter space 


OS; i‘ si (a7 + d30j) — 8; (707) (10.19) 
— — im : . . 
On j x* 62 j;—0 67; 


Cij 


!2 Species concentrations are, usually, the output of a circuit model. 
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The accuracy of cj; depends entirely on the choice of dz ad 

¢ global sensitivity analysis: it estimates changes in species concentrations by perturbing 
parameter values over large ranges. Sensitivity coefficients have not a unique expres- 
sion as in Eq.(10.19) but depend on the method chosen to carry out the analysis 
(see [48] for a review). Variance-based methods, where the sensitivity coefficients are 
calculated on the variance of species concentrations, proved to be useful for Synthetic 
Biology applications [17].!4 


Sensitivity analysis is useful also to understand the relevance of stochastic effects in a 
synthetic gene circuit. Local sensitivity analysis is generally sufficient to see how robust 
species concentrations are, at steady state, with respect to small changes in the parameter 
values that account for intrinsic stochastic variations. Global sensitivity analysis can be 
useful when some cellular mechanisms might modify the circuit dynamics significantly 
(extrinsic noise) or if considerable fluctuations in the concentration of one or more of 
the circuit species, which can highly perturb some parameter values, are expected (for 
instance, relevant changes in the amount of a repressor would modify drastically the 
transcription initiation rate of its target promoter). 


10.4 Circuit Robustness 


In a very general way, robustness can be seen as the capacity of a system to continue 
working properly in spite of some internal or external changes. To be more precise, one 
should define the property of a system that shows robustness with respect to a particular 
kind of variations. Given a synthetic gene circuit, one is mainly interested in determining 
if the circuit output, which quantifies the circuit performance, is robust against changes in 
parameter values and species initial concentrations. This type of robustness is referred to as 
canalization. In silico methods to study circuit canalization are very similar to sensitivity 
analysis. Indeed, the most straightforward way to assess output robustness with respect to 
changes in the kinetic parameter values is to compute the circuit output — at steady state — 
in a region of the parameter space around the circuit operating point. 


'3The derivative in Eq. (10.19) can be solved via a finite difference approximation method. Here s; 
is evaluated, along each direction z;, on the points of a uniform grid (i.e. the distance between two 
adjacent points is constant and equal to dz ;). 

'4Tn the work by Feng et al. [17], the variance-based method called Random Sampling — High 
Dimensional Model Representation (RS-HDMR) algorithm is applied to inverter circuits (i.e. 
NOT gates). The computed total sensitivity coefficients are evaluated over 10,000 sets of kinetic 
parameters. Variations in the parameter values go from two up to four order of magnitudes. The 
accurate results provided by the RS-HDMR algorithm are, however, due also to the fact that 
inverters do not show any particular patterns such as oscillations, feedbacks, or multistability. Global 
sensitivity analysis on more complex synthetic gene networks would require, probably, different 
approaches. 
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A different kind of robustness is called fault tolerance and represents the capacity 
of a system to go on working upon breakdown or deletion of one or more of its 
components. Fault tolerance is common in natural biological networks as a consequence 
of the redundant strategies often adopted by evolution. 

Finally, a third type of robustness, termed homeostasis, corresponds to the circuit 
capability to stabilize a quantity (e.g. the fluorescence level) against deviations from a 
given value (the one at the steady state, for instance). As we will see below, homeostasis 
is achieved via a structural motif: the negative feedback loop. 


10.4.1 Negative Feedback: Autoregulation 


A negative feedback is realized when the output of a circuit downregulates its own 
production. Becskei and Serrano [9] constructed in FE. coli a negative feedback loop 
made of a single transcription unit as shown in Fig. 10.5a. The constitutive 7 phage 
promoter, PA’, was modified with the insertion of two tet operators (tetOp2) along the 
—35...— 10 region. This synthetic promoter led the synthesis of the fusion protein 
TetR-EGFP (enhanced green fluorescent protein). EGFP is responsible for the system 
read-out, whereas TetR binds tetOp2 and downregulates Pi. In order to show that the 
feedback loop confers stability to fluorescence at steady state, the negatively autoregulated 
transcription unit was compared to an unregulated circuit where Pd had no modifications 
(see Fig. 10.5b). 

Experimentally, Becskei and Serrano quantified the variability in the TetR-EGFP 
expression by calculating the coefficient of variation CV = on (see Chap. 4) 
over two different cell populations, one containing the negative feedback loop, the other 


'SGF stands for green fluorescence. 
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the unregulated system. They obtained CV < 10% for the autoregulated circuit and 
CV * 20% for the unregulated one.'° This experimental result can be explained by 
analyzing the two systems in the proximity of their steady state. If we denote with R 
the chimeric protein TetR-EGFP, the dynamics of R in the two circuits is given by the 
following ODEs 


dR 
oe =k, —kgR= f,(R), unregulated system 
dR 1 
= ky kaR = fa(R), autoregulated system (10.20) 
dt i+ 


Ki 
where ky is the translation rate constant, kg the decay rate constant, and Ky the Hill 
constant.!’ In the unregulated system, f,,(R) admits the solution 


R(t) = ma — eT hil) + RO)eW Mat (10.21) 


where the ratio = is the value of R at steady state (R}° — see the monomolecular 
system in Chap.5). From Eq. (10.21) we can calculate the system response time, t Lip 
that corresponds to the time the unregulated system needs to reach half of TetR-EGFP 
steady state concentration (R** /2). Intuitively, the response time tells us also how quickly 
the circuit recovers from a perturbation at the equilibrium. If we set R(O) = 0, we can 
write that 


ks Ks —katy 


= —(l- aH 
Sy ae Fea ) 

=) 10.22 
ty ar (10.22) 


As for the autoregulated circuit, we shall linearize f,(R) in a neighbourhood of its 
steady state, R**. Hence, we have 


Ofa(R) 


fa(R) = fa(Ry’) + aR 


ka | 6R (10.23) 


ss ~ k ss \2 
Ri s (1 + i) 
where 6R = R— R*’. The dynamics of R follows from the solution of the following ODE 

dR 


= (—a — kg)(R — R3) (10.24) 


!6The fluorescence distributions over the two cell populations were rather different. In particular, 
the one corresponding to the cells containing the negative feedback loop showed a narrower peak 
around the mean value. 


!7TetR does not show cooperativity effects. 
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where a = = Sa If now we set b = a + kg, Eq. (10.24) becomes 
(14%) 
dR 
a —bR+bR*, (10.25) 


whose solution leads to express 71, the response time of the autoregulated circuit, as 
2 


(10.26) 


Hence, we have that #1, < ft1,. This means that the autoregulated circuit recovers 
2 2 


more quickly than the unregulated circuit from small perturbations at steady state.!* In 
other words, the negative feedback loop confers robustness against small changes in the 
concentrations of TetR-EGFP at the circuit equilibrium. 


10.4.2 Understanding Robustness: Bacterial Chemotaxis 


To get a better understanding of the concept of robustness, we can have a look at bacterial 
chemotaxis i.e. the mechanism used by bacteria to sense a chemical in the environment 
and move along its gradient. This process has been studied deeply in E. coli [3,6]. 

In a liquid medium, E. coli moves by altering periods of runs along a constant direction 
(on average they last about 1s) and tumbles during which it stops and changes direction 
randomly (average duration: ~0.1s). Overall, E. coli motion resembles a random walk 
(see Chap.4). When EF. coli senses an attractant, it reduces the tumbling frequency to 
have longer runs towards the region where the concentration of the attractant is higher. 
In contrast, after sensing a repellent, E. coli increases the tumbling frequency such that the 
probability of moving away from the repellent increases as well. 

From what we said above, a reasonable value for the steady state tumbling frequency 
is about 1 s~!. When a chemical is sensed, the tumbling frequency is modified. However, 
after a certain time interval (seconds up to minutes, depending on the change in frequency) 
it returns to its steady state value, 1s~! again. Therefore, the steady state tumbling 
frequency does not depend on the level of any sensed chemicals. This is an example 
of the so-called exact adaptation.'? Exact adaptation allows cells to respond to multiple 
changes in the concentration of one or more chemicals. How is exact adaptation achieved? 
A simplified version of the chemotaxis signalling network is schematized in Fig. 10.6. 


'8Tn [9] the stability of the two circuits is quantified with the expression oF) Live that corresponds 


to the denominator of the response time. 


19 adaptation demands only that the cells, after a perturbation, are able to settle to a new steady state. 
In this case, it would be characterized by a tumbling frequency different from 1 svi. 
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Fig. 10.6 Simplified ligands 
chemotaxis signalling pathway Le oe 
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CheY” 


flagella motors 


The receptor complex E — and its methylated configuration, E,, — can be in two states: 
active and inactive. When active, the receptor can phosphorylate CheY (into CheY?). 
CheY? acts on the flagella motors and increases the tumbling frequency. When an 
attractant binds the receptor, its probability to be active is decreased. As a consequence, 
the cell tumbling frequency is decreased as well. Repellent binding, in contrast, increases 
the receptor probability of being active. 

Receptor E is methylated by CheR and Ej, has higher probability than E of being 
active. For the sake of simplicity, we will assume that only £,,, can be in the active state 
(as depicted in Fig. 10.6). Active E, is demethylated by CheB”. CheB is phosphorylated 
by active E,,. If we call activity (A;) the concentration of active E;,, we see that Ey, 
autoregulates its activity with a negative feedback loop through the system CheB/CheB?. 
This negative feedback loop is what determines adaptation since it contrasts the receptor 
methylation due to CheR. We assume that CheR acts on E according to a Michaelis- 
Menten scheme. Since CheR <_ E, CheR works at saturation i.e. independently 
of E concentration (see Chap.2). The methylation reaction (“R”) gives a flux ve = 
k&C heRr~. where CheRr is CheR total concentration and kk the rate of E methylation. 
CheB? demethylates E,, following a Michaelis-Menten scheme as well. Hence, we can 
write that 

p (kP kB) é kB ‘i 
CheB’ +Ae = [CheB* A,.] —> CheB’ +E. (10.27) 


20The methylation reaction proceeds at maximal speed vR =y MAXR: 
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From the quasi-steady-state condition we have that 


P CheBF A. 
[CheB* A,.] = ————_., (10.28) 
Ku =F Ac 
Pp: : P21 KB +k? Be ie tai 
where CheB,, is the total concentration of CheB and Ky = ip IS the Michelis- 
1 


Menten constant. The flux of the demethylation reaction (“B”) is given by v2 = 
kBiCheB? Ac] = UM AX AK where Was = k8CheBP. Em dynamics obeys the 
following ODE 


dEn R B Ac 
= : 10.29 
di UmAX — UMAX A.+Knu ( ) 


Thus, at steady state the receptor activity is given by 
yk 
AS = Ky——“**_—. (10.30) 


Equation (10.30) shows that the receptor activity at steady state (and, therefore, the steady 
state tumbling frequency) does not depend on the ligand concentration i.e. it is robust 
to changes in the amount of the ligand in the environment. This is the condition for a 
system to show exact adaptation. If an attractant binds EF, the receptor activity decreases. 
Hence, the demethylation action of C heB? onthe receptor decreases too. Che R, however, 
goes on methylating E. As a consequence A, increases again, CheB? concentration 
gets higher and balances the action of CheR on E. If a repellent binds the receptor, A; 
increases. However, the demethylation of the active Em by CheB? becomes higher too 
and brings A; back to its steady state (see Fig. 10.7). CheR working at saturation and the 
negative feedback loop through which A; regulates itself via Che B” are the mechanisms 
responsible for exact adaptation in chemotaxis. This feature is robust to changes in every 
kinetic parameter of the chemotaxis pathway model since it is a structural property of the 
system. A;, in contrast, is not a robust quantity generally speaking” since it depends on 
Ky — hence on the choice of parameter values — and on the total concentration of Che BP 
and CheR, as shown in Eq. (10.30). Quantities such as A-, whose amount depends on the 
values of kinetic parameters, are referred to as fine-tuned. 


1CheBh = CheB? + [CheB? Ac]. 

22 Alon et al. [3] showed, with experimental data, that A%* varies over a cell population since the 
kinetic parameter values change slightly from cell to cell. Nevertheless, exact adaptation is conserved 
over the whole bacterial population. 
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Fig. 10.7 Variation of E. coli activity due to changes in repellent ad attractant concentration [6] 


Take Home Message 


Fitting a model to experimental data is an optimization problem. Finding a solution 
might require stochastic algorithms such as the Simulated Annealing or a Genetic 
Algorithm. 

Stochastic algorithms can get stuck into a local optimum when the objective function 
of an optimization problem is highly complex. 

Some parameter values are not identifiable due to a lack of data or information in the 
model. 

Differences in the speed of bio-chemical reactions allow model reduction. Quasi- 
equilibrium and quasi-steady-state condition permit to get rid of parameter values hard 
to measure and ODEs difficult to solve. 

Local sensitivity analysis quantifies the effect of small variations in the kinetic 
parameter values on species concentrations (usually) at steady state. 

Homeostasis is achieved via a negative feedback loop. 


Check for 
updates 


What You Will Learn in This Chapter 

In this Chapter you will learn how to carry out, with COPASI, some of the analysis 
techniques described in Chap. 10. Initially, we will use a new model for the Toggle Switch, 
which takes into account promoter leakage, to run sensitivity analysis and show that the 
circuit is not robust against variations in the value of the leakage translation rate constant. 
We will then use the same model to see how to run a circuit optimization with a stochastic 
algorithm (Simulated Annealing). This procedure is used to find new values for some 
selected kinetic parameters such that the circuit performance is improved. As in every 
optimization problem, you will have to specify a proper objective function. Finally, on 
a simpler circuit, where a reporter protein is expressed in the presence of an inducer 
chemical, you will see how to fit a model to two different data sets in order to estimate 
the Hill constant and coefficient. 


11.1 Sensitivity Analysis 


Let us consider the model for the Toggle Switch in Eq. (7.1). This model is based on mass 
action kinetics and neglects the presence of inducers in the circuit. By inserting two new 
reactions, which account for the basal production of the repressors R; and R2, the model 
becomes 


Ri + pi —> pi 
; oH 
Pi —> Rit pi 


k 
pi —> pt t+Ro 


© Springer Nature Singapore Pte Ltd. 2018 149 
M. A. Marchisio, Introduction in Synthetic Biology, Learning Materials in Biosciences, 
https://doi.org/10.1007/978-98 1- 10-8752-3_11 


150 11 Circuit Analysis with COPASI 


Table 11.1 Toggle Switch Parameter | Value 
parameters a 9 
P{ (0) 10 
p ©) 0 
p30) | 10? 
P5(0) 0 
R,(0) 0) 
R>(0) 0 
Volume 1.66 107 !2 
AY 10° 
do 10° 
Ly 0.1 
L2 0.1 
ky 0.5 
ko 0.5 
kat 0.00116 
kao 0.00116 
Kitk 10-9 
kik 10-* 
q 2 i 
R2 + py — Pp 
M2 
Py —> Ro+ Py 
Pi —> PEt 
k 
Ri dl 
k 
R, 
i Kok 
Pi — Pi + Ro 
-* : 
Ph —> ph+Ri, (11.1) 


where ki), and kz), are the leakage translation rate constants. Equation (11.1) gathers 10 
bio-chemical reactions. Initial species concentration, cell volume, and kinetic parameter 
values are shown in Table 11.1. 

As we have seen in Chap. 6, this model cannot reproduce bistability since cooperativity 
effects are absent. Therefore, the circuit reaches always a stable steady state where, 
depending on the circuit initial conditions, either one repressor is more expressed that 
the other or the two repressors have the same concentration. The latter case corresponds to 
the parameter values and the initial species concentrations in Table 11.1. 


11.1 Sensitivity Analysis 151 


Table 11.2 Numerical 


Parameter Value 
integration parameter values : 0 
for tasks based on steady state ee y 

Derivation factor 19-15 

Use Newton 0 

Use integration 1 

Use back integration 0 

Accept negative concentrations 0 

Iteration limit 1000 


Maximum duration for forward integration 10!5 


Maximum duration for backward integration 10° 


Before starting any circuit analysis, open the branch “Tasks” and then click on “Steady- 
State”. Here we need to modify the default values of the parameters for numerical 
integration. Set them as indicated in Table 11.2, otherwise both sensitivity and stability 
analysis will give wrong results. 

In order to run a local sensitivity analysis with COPASI, open the branch “Tasks” and 
then “Sensitivities”. In the “Sensitivities” definition panel set “Subtask” to “Steady State”, 
“Effect” to “Non-Constant Concentrations of Species”, “Cause” to “Local Parameter 
Values”, and “Secondary Cause” to “Not Set’. In the “Parameter” tabular set “Delta 
factor” to “le-06” and “Delta minimum” to “le-12”. Click on “Run”. In the new panel, 
“Sensitivities Result’, click on “scaled”. You should see the same results as in Fig. 1 1.1a. 
The sensitivity coefficients in the COPASI table describe how fast species concentrations 
at steady state change when the values of the kinetic parameters are slightly increased.! 
Green and red tonalities serve to put in evidence the parameters that cause an increase or 
a decrease in species concentrations, respectively. Parameters whose values are on a white 
background are negligible. R, for instance, tends to grow when its transcription rate, k,, 
is increased. Higher values of 42 enhance the amount of p5 and, therefore, contribute to 
an increase in Rj concentration as well. Furthermore, a reduction in R2 is provoked by 
increasing the values of kg2 and A;. As a consequence, Rj concentration at steady state 
gets higher. In a symmetrical way, a small increase in the value of ko, 41, kq1, and A2 
determines a decrease in the amount of Rj at the equilibrium.’ The two leakage constants 
are associated with sensitivity coefficients close to zero. Thus, small increments in their 
values have no effect both on R and Ro. 


'This is a local sensitivity analysis around the parameter values in Table 11.1. 

2 An increase in A, causes a decrease in the concentration of Pi: which implies a lower number of 
molecules of Rz at steady state. Hence, at the equilibrium the system has a higher concentration of 
PS: the species that leads the production of Ry. 


3Due to the circuit scheme and the parameter values in Table 11.1, analogous considerations hold 
for Ro. 
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A 


Sensitivities Result Sav 


unscaled [EIN summarized 


Rows | Target functions, Non-Constant Concentrations of Species = Columns Variables 1, Local Parameter Values B 
@ambéal)kl — (mul)ich = lambda2)kl—(mu2).41 (x2) k2 (lpn (ed 1). (2) 2 (xii k2 (421K) 1 

Ini (29.7246 eae 1-30.2195 | 30.2204 | -30.1802 | 30.6846 | -30.7236 | 30.2204 _0.0400105_-0.0393531 

(ota) =30,2595 "30.2604 (29.7639 -29.763 29.7251 -30.2201 30.2604 -29.763 — -0.0394061 _0.0387586 


(piy 0.464118 -0.464132 0.456516 0.456503  -0.455922 0.463514 -0.464132 0.456503 0.000604408 — -0.000594476 
kz} =30.2195 | 30.2204 EES +)=29:7237/ 30.6846 -30.1802 30.2204 | -30.7236 | -0.0393541 _ 0.040096 
{p2a) 29.7639 = =29.763 = 30.2595 30.2604 "~30.2201 29.7251 -29.763 30.2604 —_0.0387596 -0.0394052 
{p2q -0.456516 0.456503 "0.464118 ~0,464132 0.463514 -0.455922 0.456503 -0.464132 -0.000594491 0.000604394 


B 


Sensitivities Result Save to File 


unscaled [EEIIMI summarized 


Rows [ Target functions, Non-Constant Concentrations of Species fp Columns Variables 1, Local Parameter Values B ts 
Qambdal) a) (ru Al (Cambda2) a) (mu2) a1 (yal apa (ed) (es?) at (ellig Al (a2 th kd 

iat} 0.192714 | -0.192714 | -0.2044 0.206393 | -0.192782 |0.678248 0.204393 0.514466 —_—| -0.0116109 

total (EHTS232I) 1.19232 0.20433 -0.204328 0.192721 | -0.67801 ~0.204329 | -0.514293 0.016063 

{pid 0.000395935 -0.000395... -6.78517e.. 6.78515e-05 -6.39973e.. 0.000225147 -0.000395... 6.78517e-05 0.000170782 | -3.85408e-06 

tz] RRASENPUI24SE 0.192743 -0.192714 | HR4SE))) -O.639404 0.48507 | 0.0677541 


{o2a] 0.677784 — -0.677783 | -0.718876 (0.718859 —- -0.678023 0.385428 _-0.677784 0.718859 0.292356 ~0.040836 
(p2q -0.44678 0.446779 0.473866 ~0.473855 0.446937 0.254065 0.446779 0.473855  -0.192714 —_-0.0269181 


Fig. 11.1 Sensitivity analysis. (a) Results with the parameter values in Table 11.1. (b) New results 
after changing ky/, to 0.25s—! 


Set now ki; = 0.25s—!, ie. half of k;, and run the sensitivity analysis again. The new 
sensitivity coefficients are shown in Fig. 11.1b and are very different from the previous 


ones. This result points out that promoter leakage can affect strongly both circuit dynamics 
and steady state(s). 


11.2 Circuit Robustness 


As we have seen, promoter leakage can affect circuit performance considerably. In order 
to find out up to which extent R; and R»2 concentrations are robust to promoter leakage, we 
can vary, for instance, k1;, from 10-5 up to 0.1 s—! (i.e. 20% of k,) and calculate the steady 
state value of both Rj (Rj ) and R2 (R5°) concentration. This operation can be carried out 
via a “Parameter Scan” (to save the results, the definition of a new “Report” is required). 
In order to compute R}* and R5° for different values of kj), open the brach “Tasks” and 
click on “Parameter Scan”.* Here, select “(kIk1).k1” as an “Object”. Then, set “Intervals” 
to “500”, “min” to “le-05”, and max to “0.1”. Choose “Steady State” as a “Task”. Run 


4See Chap. 3 for general instructions about both the “Parameter Scan” task and the creation of a 
“Report”. 
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the “Parameter Scan” and open the file containing the corresponding report. You can see 
that when ky, = ko = 107>s7!, RY’ = R5° = 6.52 10~? M. However, with kijq ~ 
0.005 s~! (i.e. 1% of ky) Rj° is already much bigger than R5*° (roughly: Rj° corresponds 
to 42 molecules and R5° to 1 molecule only). This already shows that R}* and R5* (and, 
more in general, the circuit performance with the set-up in Table 11.1) are not robust 
against changes in promoter leakage. 


11.3 Circuit Optimization 


Let us suppose that we cannot engineer promoters such that their transcription initiation 
rate constant is equal to 0.5 s~! — as in our choice of parameter values — but both k; and 
ky can lie only in the interval [0.01, 0.1] s-!. Moreover, we want that, at steady state, 
R5° = 2R}°. How to find out values for k; and k2 that satisfy these two constraints? This 
question corresponds to an optimization problem whose objective function is (R2 —2R})°. 
We need to minimize this square difference in the region of the (ki, kz) parameter space 
that corresponds to [0.01, 0.1] x [0.01, 0.1]. 

After resetting k1;, to 1075 s—!, open the branch “Tasks” then click on “Optimization”. 
The “Optimization” definition panel appears. “Expression” refers to the objective function 
and should be set to “({[R2]-2*[R1])A2”. In order to write this objective function, click on 
the COPASI symbol under the word “expression”. In the new window, open the branch 
“Species” and “Transient Concentrations”. Choose “[R2](t)” and click on “OK”. In the 
expression tab, “[R2]” appears. Write a round bracket “(’ on its left and “-2*” on its right. 
Click on the COPASI symbol again and select “[R1](t)”. Complete the expression with 
“)A2” (see Fig. 11.2a). To verify that the formula is correct, click on the symbol “square 
root of x” under the COPASI symbol: if everything is fine, you should see the same as in 
Fig. 11.2b. 

Verify that “minimize” is selected and “Subtask” is “Steady-State”. Click on the 
COPASI symbol close to the “Object” bar, open the branches “Reactions”, “Reaction 
Parameters’, ‘“k1’’, and then select “k1”. Click on “OK’’. Set the “Lower Bound” to 0.01 
and the “Upper Bound” to 0.1. Set also the “Start Value” to 0.01. In order to add ky, click 
first on the symbol with a green cross and then repeat the same procedure followed for k,. 
As a “Method” choose “Simulated Annealing”. Click on “Run”. When the simulation is 
over, click on “Result” in the navigation panel, just under “Optimization”.° The column 
“Value” contains the values for k; and k2 that solve our optimization problem. To check 
that the computed values of k; and k2 are good solutions, click on “Update Model”. 
COPASI will change automatically the values of k, and k2 in the model with the new, 


>The symbol x indicates the Cartesian product. 

®You can associate each optimization with a report. Click on “Report” in the “Optimization” 
definition panel, select “Optimization” as “Report definition’, set the “Target” file that will contain 
the report, and then click on “OK”. 
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Optimization update model _) executable 
Expression (iR2}}-2*([R1])A2 @ 
{ve 


© minimize 
maximize 


Subtask Steady-State 
Randomize Start Values Calculate Statistics 
B Optimization update model _| executable 


Expression qR2]-2 {RuF 
ke 


© minimize 
maximize 
Subtask Steady-State 
Randomize Start Values Calculate Statistics 


Fig. 11.2 Optimization definition panel. (a) Objective function in COPASI syntax. (b) Objective 
function as a mathematical expression 


just calculated ones. Run a Steady State analysis: in the “Steady State Result” window 
(under “Species”) you can see if the condition specified by our objective function is 
verified. If, now, you run another optimization (after resetting both k; and k to their initial 
values), a new pair of k; and k2 will be calculated. Apparently, there are many possible 
solutions for this particular optimization problem. To limit their number, we can specify 
a further constraint. Let us require, for instance, that 2.0 10-9 < Ry < 5.0107? ice. the 
number of molecules of R,, at the equilibrium, is bigger than 2 and lower than 5. On the 
“Optimization” definition panel click on “Constraints”, then click on the COPASI symbol 
close to the “Object” bar, open the branches “Species”, “Transient Concentrations”, and 
select “[R1](t)”. Click on “OK”. Set “Lower Bound” to “2e-09” and “Upper Bound” to 
“5Se-09”. Click on “Run” to start the optimization procedure. Once the optimization is 
over, check if the results are correct, as explained above. 

Run new optimization procedures with different methods such as “Genetic Algorithm” 
and “Steepest Descent” (set, every time, “0.01” as a starting value for both k; and kz). The 
former algorithm will return, in general, good solutions. The latter one, in contrast, will 
fail to find a solution for our problem. 


11.4 Model Fitting 
Parameter values can be estimated by fitting a model to experimental data. Let us consider 
a simple circuit (like to the one in Fig.2.5) where fluorescence is expressed only in 


presence of an inducer, i. By using a Hill-function, the system is described by three 
interactions 
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ki H(n, Ky, i) 
tr A F 


kp 
—> 
r & , (11.2) 
that lead to the following ODE to calculate the dynamics of F 
dF Cr 
22 Sp a, (11.3) 
dt 1+ ( = yn 


Let us set kj, = 5107!°M/s, ky = 107!2M/s, and kg = 0.00116s~!. Moreover, 
let us suppose to have two sets of experimental data, where the steady-state values of the 
concentration of F have been measured for more than 20 different concentrations of i. 
This data are written into the files “il.txt” and “i2.txt” (see Tables 11.3 and 11.4). Our goal 
is to find values for n and Ky that fit the model in Eq. (11.3) to the data collected into 
“Z1.txt” and “12.txt’”. In the navigation panel select “Tasks”, then “Parameter Estimation”. 
The “Parameter Estimation” definition panel is similar to the “Optimization” one. As an 
object select first Kj’ and set the “Lower Bound” to 10~? M and the “Upper Bound” 
to 10-°M. Afterwards, choose n as an object with boundaries 1 and 3.° Check the box 
“all” corresponding to “Affected Experiments”. Click on “Experimental Data”. In the 
new windows that appears on the screen click on the symbol with the green cross beside 
“File” and load both “il.txt” and “i2.txt”.? “Experiment Type” should be “Steady State”. 
In the panel below, set the “Type” of the “input” column as “independent”. Click on the 
COPASI symbol on the right and select “Species”, “Initial Concentrations”, “[i](t=0)”: 
“{i]_0” is written under the column “Model Object’. Set “output” as “dependent” and 
associate it with “Species”, “Transient Concentrations”, “F](t)”!° (see Fig. 11.3a). Click 
on “OK”. Back to the “Parameter Estimation” definition panel select “Genetic Algorithm” 
as a “Method”, then click on “Run’’. Click on “Results” under “Parameter Estimation” in 
the navigation panel. Click on “Parameter” on the “Parameter Estimation Result” panel. 
The computed values should be very close to 10~’ M for K y and 2 for n (see Fig. 11.3b).!! 


Tif you defined it as a “Global Quantity”, you will find it under “Global Quantities”, “Initial Values”. 
8Both for K H and n use as “Start Value” the value of the “Lower Bound”. 


9As in Tables 11.3 and 11.4, the two files start with a header row with the names of the two 
columns: “input” and “output”. The first column contains the concentrations of i, the second one 
the concentrations of F’. On each row, concentration values are separated by a “tab”. 

10Click on “i2.txt”. “Copy Settings” is set, by default, to “from previous”, which is fine with our 
simulations. 


'1 The data files were indeed generated with these values and then modified by hands. 
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Table 11.3 “il.file” 


Input 
le-09 
2.595e-08 
5.09e-08 
7.585e-08 
1.008e-07 
1.2575e-07 
1.507e-07 
1.7565e-07 
2.006e-07 
2.2555e-07 
2.505e-07 
2.7545e-07 
3.004e-07 
3.2535e-07 
3.503e-07 
3.7525e-07 
4.002e-07 
4.2515e-07 
4.501e-07 
4.7505e-07 
5e-07 


Take Home Message 
Among its features, COPASI offers an intuitive implementation of local sensitivity 


analysis that, returns the sensitivity coefficients at steady state. 


Output 
9.0e-10 
2.6e-08 
8.2-08 

1.2-07 

2.4e-07 
2.9e-07 
3.1e-07 
3.2e-07 
3.2e-07 
3.6e-07 
3.9e-07 
3.8e-07 
4.0e-07 
3.9e-07 
4.2e-07 
4.2e-07 
4.0e-07 
4.0e-07 
4.2e-07 
4.1e-07 
4.2e-07 


Circuit optimization aims at improving circuit performance. This operation requires 
to compute new values of one more kinetic parameters and can be carried out via a 


stochastic algorithm. 


COPASI permits to fit a model to experimental data via the “Parameter Estima- 


tion” task. 


11.4 Model Fitting 


Table 11.4 “i2.file” 
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Input Output 
le-09 9.1e-10 
2.595e-08 | 1.6¢e-08 
5.09e-08 7.2-08 
7.585e-08 | 1.2-07 
1.008e-07 | 2.3e-07 
1.2575e-07 | 2.6e-07 
1.507e-07 | 1.8e-07 
1.7565e-07 | 3.2e-07 
2.006e-07 | 3.9e-07 
2.2555e-07 | 3.6e-07 
2.505e-07 | 3.2e-07 
2.7545e-07 | 3.8e-07 
3.004e-07 | 4.3e-07 
3.2535e-07 | 3.2e-07 
3.503e-07 | 4.6e-07 
3.7525e-07 | 4.1e-07 
4.002e-07 | 3.8e-07 
4.2515e-07 | 4.1e-07 
4.501le-07 | 4.2e-07 
4.7505e-07 | 4.2e-07 
Se-07 4.1e-07 
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Fig. 11.3 Parameter estimation. (a) “Experimental Data” window. (b) Result panel 
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What You Will Learn in This Chapter 

Gene circuit motifs are structural patterns associated with specific functions. In genetic 
networks, motifs are made of a small number of transcription units. A single transcription 
unit that contains a regulated promoter is sufficient to have either a negative or a positive 
feedback loop. Although very simple, they carry out important tasks such as speed up 
or slow down the circuit response time, achieve homeostatis (see Chap. 10) or obtain 
bistability (see Chap. 6). The latter is an essential feature to build genetic memory devices. 
Like these feedback loops, most of the motifs presented in this chapter are based on 
transcription regulation. Logic operations (Boolean gates) can be carried out by controlling 
translation as well, either with structures such as riboswitches and ribozymes or via 
RNA interference (see Chap. 8). We considered as motifs also cell consortia. They are 
populations of cells where a given function emerges from the interactions of sets of cells 
devoted to different tasks. As we will see, digital circuits have been engineered as S. 
cerevisiae cell consortia, where some cells sense the inputs and other perform the logic 
operation and express a fluorescent output. Finally, we included among circuit motifs 
re-engineered pathways too. We will show that natural signaling pathways have been 
modified, both in eukaryotic and bacterial cells, to either respond to an input or express 
an output different from the original one. Complex circuits can potentially arise by the 
composition of all these basic motifs. 
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12.1 Transcriptional and Translational Controls 
12.1.1 Constitutive Gene Expression 


As a reference for the analysis of more complex motifs we take a transcription unit where 
a promoter (without any regulation) leads the synthesis of a protein, P. If we omit to 
consider the promoter explicitly, we can model this simple system with a single species 
(the protein P) and two reactions: constitutive protein synthesis (Oth-order reaction) and 
protein degradation (1st-order reaction). Hence, we have 

+“, p 


ps (12.1) 


The reactions in Eq. (12.1) determine the following ODE 


dP 
—=k,—kgP (12.2) 
dt 


that, as we have seen in Chap. 2, admits as a solution the function 
ks —kat 
P(t)=—(1-e™), (12.3) 
ka 


where we have assumed that P(0O) = 0. At steady state P concentration is equal to iz. 


The response time of the system (f1) i.e. the time P takes to reach half of its steady state 
concentration (see Chap. 10) is given by 
In(2) 


= (12.4) 


Therefore, t1 is completely determined by the protein decay rate constant kg. 
2 


12.1.2 Negative Autoregulation (Negative Feedback Loop) 


We have already met this motif in Chap. 10 and we know that it serves to stabilize the 
steady state concentration of a protein P, as in Fig. 12.1a. Moreover, a negative feedback 
loop shows a response time shorter than that of the constitutive gene expression motif. 
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A B 


Fig. 12.1 (a) Negative feedback loop. (b) Positive feedback loop. (c) A positive feedback loop has a 
single stable steady state in the absence of cooperativity (n = 1). (d) In the presence of cooperativity 
(n > 1), more than a single steady state is possible. Here we show how the number of steady states 
varies for different values of ky. Stable steady states are colored in green, the only unstable steady 
state is red, whereas bifurcation points are colored in yellow 


12.1.3 Positive Autoregulation (Positive Feedback Loop) 


Compared to the constitutive gene expression, a positive feedback loop shows a longer 
response time. This can be proved by following the same procedure as in Chap. 10, where 
we calculated the response time of a negative feedback loop. Protein P, as depicted in 
Fig. 12.1b, obeys the ODE 


dP 
= =k +k 


ss —kgP = f (P), 12. 
ii “PaK,  * f(P) (12.5) 


where k; is the leakage rate constant, k, the translation rate constant, Ky the Hill 
constant, and kg the protein decay rate. Equation (12.5) neglects cooperativity effects (Hill 
coefficient: n = 1). Under this assumption, the system has a single steady state (P**) 
given by the relation kg P — k; = ky PUG (see Fig. 12.1c). The response time is inversely 
proportional to 


af (P) 
oP 


(Ky + Pss)2 


ky K 
= ( = ku) =ki=b (12.6) 
Pss 


162 12 Circuit Motifs 


where b = More precisely 


ksKy 
(Kut Psy" 


In2 In2 
= >. (12.7) 
2 kd—b kd 


If molecules of P bind cooperatively (n > 1) to their target promoter, the autoregulation 
motif shows bistability and hysteresis (as in the Toggle Switch — see Chap. 6). In this case, 
Eq. (12.5) is rewritten as 


ee +k es kgP (12.8) 
dt | phy Ke 
and the steady state condition implies that 
Pp” 
ks ———— = kgP — ky. 12.9 
Ss Pr + ia d i ( ) 


Equation (12.9) is plotted in Fig. 12.1d. Bistability arises if the translation rate constant ks 
and the decay rate constant kg are balanced. If we suppose to keep ks constant and modify 
kq only (this is a reasonable wet-lab strategy), the system will have a single stable steady 
state at either very high values of kg (i.e. short half-life and a consequent small amount 
of P at the equilibrium) or very low kg values (opposite situation: long half-life and high 
concentration of P at the equilibrium — green squares in Fig. 12.1d). In between, the circuit 
shows bistability and can fall into either stable steady state (green circles in Fig. 12.1d, the 
red one is an unstable steady state) depending on its past conditions. ! 

Positive autoregulation provides a way to engineer memory systems. Ajo-Franklin et al. 
[1] constructed a galactose sensing device and then turned it into a memory by adding 
autoregulation via a positive feedback. 

The two circuits, shown in Fig. 12.2, were built in yeast and made use of an orthogonal 
activator i.e. a chimeric protein made of the bacterial LexA-DBD (DNA binding domain) 
fused to the VP64? activation domain and a nuclear localization sequence (NLS). In the 
galactose-sensing device, the activator P (LexA-DBD — VP64 — NLS) is expressed under 
the control of the yeast GALI promoter (Pg,4zx1) that is activated by galactose. Once 
produced, P binds eight /ex operators (lexOp) placed in front of the minimal CYC1 
promoter (Pmincyci) and turns on the expression of a yellow fluorescent protein (YFP 
— see Fig. 12.2a). This sensor gives a fluorescent signal only in the presence of galactose. 
When galactose is removed and cells are resuspended in a glucose-containing medium, 
fluorescence disappears after several cell duplications (dilution effect). The memory 


Yellow triangles in Fig. 12.1d are bifurcation points where a stable and an unstable steady state 
coexist. Green triangles, in contrast, are stable steady states. 


2From the herpes simplex virus. 
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A Galactose sensor 


galactose 


lexOpg-Prrincyct 


B Memory circuit 
galactose 


lexOpg-Prrincyct 


Fig. 12.2 (a) Galactose sensing device. (b) Memory circuit 


Fig. 12.3 Memory circuit 
analysis. (a) Two stable steady 
states (green circles —- memory 
behavior) are obtained with a 
proper balance in the 
production and degradation of 
the activator P. (b) Bistability 
is obtained by increasing the 
half-life of the activator P. (c) 
As an alternative strategy, 
bistability can be achieved by 
using two copies of the plasmid 
encoding for the fusion protein 
YFP-P [1] 


A 


degradation memory 


dP/dt (nM/s) 


P+ YFP-P (nM) 
B C 

90 min 2 copies 
a zs 
= = 
s 120 min s 1 copy 
= g 

P+ YFP-P (nM) P+ YFP-P (nM) 


device, in contrast, goes on producing the output (YFP) after galactose removal. The 
positive feedback is built by fusing YFP to P (Fig. 12.2b). The in vivo implementation of 
this synthetic memory was driven by theoretical considerations and computer simulations. 
The growth medium and the copy number of the gene encoding for the YFP-P fusion 
protein were modified in order to approach the values of ks and kg that were predicted 
to confer bistability to the circuit i.e. to lock the system into the high-fluorescence steady 
state after induction with and successive discharge of galactose (see Fig. 12.3). 
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12.1.4 Oscillatory Systems 


In the Repressilator, sustained oscillations arise not only from the choice of parameter 
values (as we have seen in Chap. 6) but also from the ring structure in which the three genes 
are placed (Fig. 12.4a). Only some circuit configurations are able to produce oscillations. 
If we assign a minus (—) to each repression and a plus (+) to each activation in a 
circuit where genes are arranged into a closed loop (such as a ring), the circuit will be 
able to sustain oscillations only if the overall sign of the scheme (i.e. the product of all 
repression/activation signs) is a minus (see Fig. 12.4b). Beside the Repressilator, another 
oscillatory circuit was engineered in E. coli. It is known as “the positive and negative 
feedback oscillator” [5]. Here, gene A undergoes positive and negative regulation via 
two different feedback loops (see Fig. 12.4c). Stability analysis showed that parameters 
quantifying the strength of both activation and repression needed to be chosen carefully 
in order to have sustained oscillations instead of damped oscillations or a toggle switch 
behavior. Moreover, the stability diagram pointed out that the easiest way to convert the 
circuit from an oscillator into a toggle switch was to remove the activation of gene B by 
gene A (kAmg = 0, see Fig. 12.4c-e). 


12.1.5 Boolean Gates (Transcription Regulation) 


In electronics, Boolean gates are basic components of digital circuits. Their input and 
output take only two values: high (1) and low (0). In Synthetics Biology, Boolean gates 
can be used to trigger a decision (as in the “RNAi-based logic evaluator’, see Chap. 8), 
sense one or more substances in the environment (biosensors) or perform computations. 
Promoters can be engineered in order to mimic logic behavior under the control of 
repressor and activator proteins. Most of the designs shown in this paragraph hold for 
E. coli promoters and are discussed in [10]. It should be noted that their inputs are 
transcription factors, although it is more common to refer to chemicals as inputs for gene 
Boolean gates. 

In order to show a proper logic behavior, the 0 and 1 outputs (e.g. fluorescence levels) 
should be well-separated. A promoter regulated by a single repressor (e.g. LacI, TetR or 
cI) can be used to implement a NOT gate (Fig. 12.5a). In order to make repression stronger 
and decrease the 0 output, two operators can be placed in the —35.. — 10 promoter region. 

Two different repressors, R; and Ro, acting on the same promoter are required to have 
a NOR gate, which produces high fluorescence only in the absence of both repressors 
(Fig. 12.5b). 

A NAND gate demands two different repressors too but they have to bind cooperatively 
(Fig. 12.5c). We can suppose that Ro has a low affinity towards its own operator and binds 
the DNA poorly. However, R2 operator overlaps the RNA polymerase binding site and Ro 
would prevent transcription once bound to the DNA steadily. R; has a higher affinity to its 
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Fig. 12.4 Oscillatory systems. (a) Repressilator scheme. (b) General schemes for oscillators and 
toggle switches. (c) The positive and negative feedback oscillator in [5]. (d) Stability diagram. 
kAmpg, kBm,g, and kAma are rate constants associated with the production of an mRNA species 
(m, or mpg) under the control of a protein (the activator A or the repressor B). (e) Toggle switch 
built on the positive and negative feedback oscillator. The circuit has no longer a closed structure 


own operator that, however, lies far from the transcription initiation region. Hence, R1 per 
se cannot repress transcription. In the presence of hetero-cooperativity, R, (once bound 
to its operator) helps R2 bind the promoter by, for instance, rotating the DNA sequence 
and increasing, as a consequence, the affinity between R» and its binding site. Thus, only 
in the presence of both repressors the gate output is low, whereas in the other three cases 
fluorescence is high. 

An AND gate demands hetero-cooperativity between two activators: A; and A2 
(Fig. 12.5d). In this case, only Az can recruit RNA polymerase and initiate transcription. 
However, Az has a low affinity to its own operator and is able to bind the promoter only 
when A, is already bound to the DNA. A, binds its own operator strongly and increases 
the affinity of Az towards its binding site. Hence, fluorescence output is high only when 
both activators are present. Notice that, in general, engineering hetero-cooperativy is not 
banal and alternative designs have been proposed (see below). 

OR logic operation wants two activators as well. However, they can recruit RNA 
polymerase independently. Therefore, the output is high when at least one activator 
is present. Bintu et al. [10] pointed out that bacterial promoters showing synergistic 
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Fig. 12.5 Boolean gates — with their truth tables — based on transcription regulation (i: input, o: 
output, FP: fluorescent protein, RNAp: RNA polymerase). Lines ending with a circle represent 
molecule recruiting. (a) A NOT gate takes a repressor as an input. (b) A NOR gate demands two 
different repressors as inputs. (c) A NAND gate is realized with two different repressors too. (d) An 
AND gate employs two activators. (e) An OR gate uses two activators as well. (f) An N-IMPLY gate 
arises from the binding to the DNA of one repressor and one activator 
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activation can be used to build OR gates. In this case, when both A, and Az are in the 
cell, they recruit RNA polymerase together. As a result, they induce a fluorescence level 
much greater than the sum of the levels due to each activator separately. 

Finally, a promoter regulated by an activator and a repressor corresponds to the N- 
IMPLY logic operation, giving high fluorescent only when the activator is present and the 
repressor absent (“10” entry in Fig. 12.5f). 


12.1.6 Boolean Gates (Dimer Systems) 


An alternative AND gate design based on transcription regulation was presented by Wang 
et al. [45]. Implemented in E. coli, this AND gate exploits the proteins HrpR and HrpS* 
from Pseudomonas syringe. When both HrpR and HrpS are produced, they dimerize and 
recruit a particular RNA polymerase,’ responsible for tight gene activation, to the hrpL 
promoter (pHrpL — see Fig. 12.6a). The AND gate is a module made of two CDSes, two 
terminators, and one promoter. It can be connected to two RBSes on the input side and one 
RBS on the output side. The authors engineered also a NOT gate as a module orthogonal 
to the AND gate. The NOT gate module is based on the cl repressor acting on the A phage 
promoter (P, — see Fig. 12.6b). 


A 
ChpR AND gate : 
iy _ _ | output 
2 -—-Z3s «oe 
Pa or °° 8 4 eis 
CO hrpS pHrpL 
B 
NOTgate a. | a 
2ra DT N ADT 
C p: 


Fig. 12.6 Boolean gate design as in [45]. (a) AND gate based on the HrpR/HrpS system. (b) NOT 
gate based on the cI repressor 


3Hrp stands for hypersensitive response and pathogenicity. 


4Tt contains the o> domain. 
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Several variants of both AND and NOT gates were assembled. Some of them, once 
connected to each other, formed working, modular NAND gates. AND and NOT gate 
variants were tested by changing only the RBSes on the input side. 


12.1.7 Boolean Gates (Translation Regulation) 


As we have seen in Chap. 8, the RNAi-based logic evaluator is made of two AND gates 
regulated by siRNAs. A different way of building Boolean gates by acting on translation 
regulation requires to employ mRNA structures such as riboswitches or ribozymes. They 
are bound by chemicals and, therefore, allow a direct control of translation. Riboswitches 
and ribozymes contain one or two aptamers (the chemical binding sites) and an actuator. 
In a riboswitch, the actuator changes its secondary structure upon chemical binding. 
As a consequence, a hurdle to ribosome binding is removed or created. Therefore, a 
riboswitch can either allow or prevent translation initiation. Generally, riboswitches are 
placed on the 5’UTR of the mRNA. As for ribozymes, the actuator becomes either active 
or inactive upon the arrival of a chemical to the aptamer. When the actuator is active, a 
ribozyme autocleaves and, in this way, regulates translation. Win and Smolke presented 
a library of Boolean gates based on ribozymes placed on the mRNA 3’UTR [46, 47]. 
Upon autocleavage, these synthetic ribozymes repress translation by inducing mRNA 
degradation. They are built in a modular way by joining an aptamer to an actuator (a 
hammerhead ribozyme) via a transmitter (a short RNA sequence-see Fig. 12.7a). Several 
two-input gates were engineered according to the following three schemes 


1. two independent ribozymes (Fig. 12.7b), 
2. a single ribozyme with two aptamers, one on each stem (Fig. 12.7c), 
3. a single ribozyme with two aptamers on the same stem (Fig. 12.7d). 


Single ribozymes can be either buffer - YES — gates (they are referred to as L2bulgel 
in the works by Win and Smolke) or inverter - NOT - gates (L2bulgeOff1). A chemical 
binding L2bulgel inhibits the ribozyme catalytic activity; opposite result has a chemical 
on L2bulgeOff1. Following scheme 1, an AND gate is engineered by placing two buffer 
gates on the mRNA 3’UTR. In contrast, a NOR gate is built with two inverters. Both 
gates work only when the catalytic activity of the two ribozymes is inhibited: the AND 
gate requires the presence of the two chemicals, the NOR gate their absence. A NAND 
gate was engineered with scheme 2. In this configuration, the ribozymes is active (zero 
output) only when both aptamers (inverters) are bound by their input chemicals. Another 
AND gate, which responds to theophylline and tetracycline, was constructed according to 
scheme 3 (see Fig. 12.7e). In this design, a chemical binds its corresponding aptamer and 
forces a structural change in the downstream sequence (another aptamer or the actuator). 
In the absence of the two inputs, the ribozyme is active. Thus, no output (fluorescence) is 
produced. The aptamer where tetracycline binds keeps the aptamer downstreams “closed” 
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Fig. 12.7 Synthetic ribozymes. (a) Modular construction. (b-d) The three possible schemes for 
engineering 2-input Boolean gates. (e) An AND gate based on scheme 3 


to theophylline.” Upon tetracycline binding, the structure of the theophylline-responsive 
aptamer changes i.e. theophylline can bind. Upon theophylline arrival, the configuration 
of the actuator switches to inactive and fluorescence expression starts. Hence, the output 
is 1 only when both tetracycline and theophylline are bound to the ribozyme.° 


12.1.8 Gene Cascade: Delay and Ultrasensitivity 


A cascade is made by a “row” of genes, the first regulating the second one, the second 
the third one and so on. Each regulation adds a step to the cascade. Hooshangi et al. 
[25] engineered in E. coli three different gene cascades by using repressor proteins (TetR, 
Lacl, and cI — see Fig. 12.8). The dynamics of each cascade was controlled with a single 
chemical: tetracycline, which acts on TetR. The output, produced by the last step in the 
cascade, was the yellow fluorescence protein (YFP). Initially, each circuit reached a steady 


5This means that theophylline cannot bind the ribozyme in the absence of tetracycline. 


©A buffer gate was realized according to scheme 3 by using two aptamers bound by theophylline. 
This YES gate showed moderate cooperativity effects (Hill coefficient: n = 1.65). 
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Fig. 12.8 Gene cascades. Circuits implemented in [25] 


state in the absence of tetracycline. Induction with tetracycline determined an increase in 
fluorescence in the 1- and 3-step cascade and a decrease in fluorescence in the 2-step 
cascade. 

The comparison of the response time at different steps quantified the delay in the output 
production due to the cascade length (see Fig. 12.9a). 

This motif can be used to establish a temporal program of gene expression. Long 
cascades, moreover, behave as low-pass filter i.e. they are insensitive to short input pulses 
(the 3-step cascade did not respond to tetracycline inductions shorter than 15 min). Another 
important feature of gene cascades is ultrasensitivity to the input signal (see Fig. 12.9b). 
Ultrasensitivity means that the output/input curve has a sigmoidal shape (Hill function) 
due to cooperative effects. Hooshangi et al. measured an increase in the Hill coefficient, 
n, from 2.3 (1-step) to 7.5 (3-step). This result might be due to the particular propagation 
of the input signal to the output (YFP) through different repressor proteins. It should be 
noted that a working gene cascade is not easy to engineer in vivo because of two relevant 
drawbacks 


1. the noise in the output is amplified at intermediate fluorescence values for high cascade 
length. This might prevent synchronisation of cell response over a population, 

2. acareful fine-tuning of kinetics parameters (such as promoter strength and leakage) is 
necessary in order to assure a proper signal propagation along the cascade. 
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12.1.9 The Feed Forward Loop (FFL) 


A Feed Forward Loop is made of three genes: X, Y, and Z. X regulates Z along two 
branches: a direct regulation and an indirect one that passes through Y (see Fig. 12.10a). 
Since each interaction can be either a repression or an activation, 8 FFLs are possible. 
Moreover, FFLs can process in different ways the two inputs at gene Z (common 
operations are logic multiplication and sum). If the regulation signs along the two branches 
are the same, the FFL is called coherent (C), otherwise incoherent (I). The I1 (incoherent 1) 
FFL is a recurring scheme in natural networks. Here, X activates Z along the direct branch 
and inactivates it on the indirect path (X activate Y that inactivates Z — see Fig. 12.10b). Z 
works as an N-IMPLY gate (logic multiplication of the inputs). 

I] FFL can be used as a pulse generator. A pulse generator permits to have a transient 
response to a long-lasting input (similar to what happens with exact adaptation in bacterial 
chemotaxis — see Chap. 10). The pulse generator implemented in E. coli by Basu et al. [8] 
is shown in Fig. 12.10c. Once activated by AHL (acylated homoserine lactone) inducer, 
LuxR binds the two downstream promoters (lux Pr and lux Pr-cI Or1) and switches on 
the synthesis of both GFP and cI repressor. GFP level rises until enough cI is produced 
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Fig. 12.10 Feed forward loop. (a) General scheme. (b) I1 FFL. (c) Pulse generator scheme [8]. (d) 
Comparison of the response time of the I1 FFL and the “constitutive gene expression” motif 


to bind and fully repress ux Pr-cl Or. At this point, GFP’ starts decaying to its steady- 
state level (a typical pulse shape is shown in Fig. 12.10d). In general, a pulse signal can 
be characterized by several parameters (e.g. rise time, fall time, width, gain®). Basu et 
al. highlighted, with experimental results, some important features of the I1 FFL. For 
instance, they pointed out that this motif can be reset to the original state after being 
exposed to AHL. In their experiment, cells were induced with 140nM AHL for 4h during 
which a pulse in fluorescence was registered. Cells were then washed to remove AHL 
and let grow for 6 more hours in the absence of AHL. Finally, cells were re-induced with 
140nM AHL. A second fluorescent pulse with the same intensity as the first one was 
detected. In another experiment, Basu and co-authors measured the refractory time i.e. the 
time the circuit needed to respond to a second input signal after a first induction with AHL. 
In this case, cells were induced with 140nM AHL for 6h, then washed and resuspended 
in fresh medium without AHL. From that moment, aliquots were taken every 10 min and 
re-induced with 140nM AHL for 20 min before measuring their fluorescence. A second 
pulse (as high as the first one) was detected after about 140 min. The refractory time was 
mainly due to the decay of the cI proteins that downregulated GFP production. Further 
experiments pointed out that pulse height was dependent on AHL concentration, although 
no big variations were observed from 47 up to 4700 nM. Moreover, by inducing the circuit 


7A destabilized version of the GFP (shorter half-life) was used in [8]. 
8The gain, G, of a pulse signal is defined as: G = MAX~E 


peak and E that of the steady state. 


, where M AX is the value of the pulse 
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at different AHL rates, pulse height and shape were shown to change. High AHL rates 
caused a fast production of both GFP and cI with a high fluorescent peak and a quick 
shut-off of the ux Pr-cI Or1| promoter. Low AHL rates determined a slow production of 
both GFP and cI corresponding to a lower but broader peak and a delayed repression of 
lux Pr-cI Or1. These results proved that E. coli cells re-engineered with this particular I1 
FFL were able to sense the time derivative of AHL concentration. 

I1 FFL performs another function: it speeds up the response time with respect to the 
“constitutive gene expression” motif provided that the two motifs share the same steady 
state. An intuitive proof of this feature is given in [2]. To summarize it, let us call A, R, 
and Z the three proteins involved in the I] FFL (the activator, the repressor, and the output, 
respectively). A exists in two configurations: inactive (A’, the wild-type configuration) and 
active (A“, upon binding the FFL input chemical). Once a signal is sensed (like AHL in 
[8]), Z is produced under the action of A“. Z concentration grows until R reaches the 
necessary amount to contrast A“ and limit Z synthesis. Therefore, up to the pulses peak, 
the temporal behavior of Z is approximated by an ODE that neglects the presence of R in 
the system 


Z 
aie — = az, (12.10) 
H 


where k; is the leakage rate constant, k, the translation rate constant, kg is the decay rate 
constant associated with Z and, as usual, n and Ky represent the Hill coefficient and 
constant. Let us suppose that, when the FFL is induced with the input signal, A’ is already 
at steady state and gets activated in a very short time (fast interaction with the input). 
Therefore, A“ reaches its equilibrium quickly and we can simplify Eq. (12.10) by replacing 
the quantity kj + ks Ketan with a constant, A. Thus, Eq. (12.10) becomes 


dZ 
—=)-kgZ (12.11) 
dt 
that admits as a solution 
r —kat 
Z(t) = —(l-e ““’) (12.12) 
ka 


with Z(0) = 0. Equation (12.12) gives the dynamics of Z from the induction time (¢ = 0) 
to the pulse peak (¢ = tp). From tp on, we can consider that Z is produced with a new rate 
ju < X because of the high amount of R in the system. Hence, Z settles to a steady state 
concentration Z** = re The I1 FFL response time t 1 follows from 


ioe x a Hath) 
Se _— 2 
ae Sue 
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ea 
1 2a 
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A “constitutive gene say eee motif with the same Z** steady state concentration has 
a response time tT = me . Since « < i,’ we have that 1 < t (see Fig. 12.10d for a 
graphical scpreccntionn: Therefore, the I1 FFL — similarly to the negative feedback loop 
— shows a response time shorter than that of an unregulated transcription unit that reaches 
the same steady state. 


12.2 Cell Consortia 
12.2.1 Quorum Sensing 


The circuits we have seen so far were fully implemented into single cells. In an alternative 
design, circuits are organised into groups of interacting cells, each group carrying out a 
different function. Such a structure of collaborating cells is called cell consortium. In a 
consortium, some cells have the task to process the input, other to perform a particular 
operation and produce the output. Cells are wired together via small molecules such as 
chemicals and hormones. 

Quorum sensing, i.e. the mechanism through which some bacteria control gene 
expression in response to cell density, implements a kind of communication among 
cells via AHL molecules and can be used to engineer synthetic consortia. In the marine 
symbiotic bacterium Vibrio fischeri, the protein LuxI is responsible for the production 
of AHL that diffuses through the cell membrane and accumulates in the environment. 
When bacterial cells reach high density, AHL “goes back” into the cells where it binds 
and activates LuxR protein that triggers luminescence production (see Fig. 12.11a). In a 
synthetic circuit, LuxR can be exploited to drive the synthesis of the read-out. Basu et al. 
[8] placed the I1 FFL that we have seen in the previous section under a quorum sensing 
control. The consortium they engineered is made of sender cells, which produce AHL, 
and receiver cells, where a pulse in fluorescence is generated. Moreover, LuxI and AHL 
synthesis takes place only in the presence of tetracycline (see Fig. 12.11b). Basu et al. 
showed that the receiver response depended on the distance from the sender when cells 
were grown on solid medium. Spatial distance modulated the rate at which AHL bound 
LuxR. The shorter the distance, the faster the response and the higher the pulse. No pulse 
was detected when the distance between the sender and the receiver cells was greater than 
or equal to 4.5 mm. 


an = ah, wherea < 1. 
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Fig. 12.11 Quorum sensing. (a) General principle. (b) The complete scheme of the synthetic gene 
circuit engineered in [8] 


12.2.2 Yeast Consortia 


Regot et al. [42] engineered Boolean gates and small digital circuits via cellular consortia 
in yeast. In this circuit design, a cell is the basic module to carry out a Boolean function. 
Cells react to one or two inputs and produce either a yeast pheromone (input for other cells) 
or the circuit output (fluorescence). Only four kinds of basic Boolean gates are required: 
buffer (YES) and NOT, responding to a single input; N-IMPLY and AND, sensing two 
inputs. Input to the circuit (external inputs) are: NaCl, doxycycline, B-estradiol, 6a (an 
inhibitor of kinase Fus3), and galactose. Internal inputs (wiring molecules) are the a factor 
from S. cerevisiae (as-) and Candida albicans (acq). Computational analysis showed 
that with up to four cells and 3-4 different wires more than 200 (out of 256) 3-input 
functions can be realized. Hence, compared to transcriptional networks, cell consortia 
might decrease the complexity of Boolean circuits drastically. A YES gate needs two cells. 
One senses the circuit input (galactose) and only in its presence produces the a factor of 
S. cerevisiae. This pheromone activates the yeast mating pathway into the second cell 
(“Cell 6” — see Fig. 12.12a), which leads the production of GFP.'!? A NOT gate shows 
a similar architecture (Fig. 12.12b). Here, however, the first cell senses doxycycline that 
inhibits as, production. AND and NOR gates require that the cell that expresses GFP 
senses an external input signal as well (Fig. 12.12c-—d). In particular, “Cell 4” in the NOR 
gate implements an N-IMPLY function. The OR gate in Fig. 12.12e is made of three cells. 
Two different cells sense two distinct inputs (NaCl and galactose) and produce the same 


!O0GFP expression is controlled by Fus! promoter, a component of the yeast mating pathway (see 
Sect. 12.3). 
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Fig. 12.12 Boolean gates and digital circuits realized with cell consortia in yeast [42]. (a) YES 
gate sensing glucose. (b) NOT gate responding to doxycycline. (c) AND gate taking NaCl and beta- 
estradiol as inputs. (d) NOR gate processing doxycycline and 6a. (e) OR gate detecting either NaCl 
or galactose (or both inputs). (f) The MUX2tol multiplexer 
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wiring signal (as,) that acts on a unique output-cell (“Cell 6”). A more complex circuit, 
termed MUX2tol multiplexer, is built according to the distributed output architecture 
(Fig. 12.12f). This circuit corresponds to the Boolean formula (a A c) V (a4 A b) where 
a is doxycycline, b galactose, and c B-estradiol. When a = 1, the output corresponds to 
the value of c. Vice versa, when a = 0 the output corresponds to the value of b. In this 
circuit, the OR operation is achieved by distributing the output production among the cells 
that carry out the AND and the N-IMPLY function.!! Finally, the MUX2tol needs two 
wiring molecules, as, and acq, in order to connect cells without any cross-talk. 


12.2.3 Spatial Modeling 


How to take into account, in a model, the movement in the three-dimensional space of 
the molecules exchanged in a cell consortium? The problem of modeling spatial motion 


'1 This architecture was adopted, previously, by Rinaudo et al. in the RNAi-based logic evaluator 
[43] — see Chap. 8. 


12.2 Cell Consortia 177 


arises also in different contexts, for instance when dealing with eukaryotic transcriptional 
networks. Models for genetic circuits in eukaryotic cells demand to consider the transport 
of mRNA, from the nucleus to the cytoplasm, and proteins — from the cytoplasm to the 
nucleus. The diffusion equation permits to describe the concentration of a species c as a 
function of both time (f) and spatial coordinates, r = (x, y, z). It states that 


dc(r, t) 


eo DV? c(r, t) (12.14) 


where D is the diffusion constant of c and V* = ae + ir + # is the Laplacian operator. 
To solve the diffusion equation, one has to specify the initial concentration of c — c(r, 0) - 
and the boundary conditions c(ro, t), where rg are the points that limit the spatial region 
in which the diffusion equation is solved. One, for instance, can simply set c(r9, t) = k, 
which means that c is constant at the boundary points. The diffusion equation is an example 
of PDE - Partial-Derivative Differential Equation. 

A system of N molecules that participate in M bio-chemical reactions and move in the 
space is described by the reaction-diffusion equation 


M 
dc; (r,t) = 2 
a = Y (v5 + vj;) + DiV'ci(r, t) (12.15) 


j=l 


where v;; are the fluxes associated with the jth reaction. They either increase (+) or 
decrease (—) cj (r,t), i = 1,..., N. This approach should be followed when studying, 
for instance, pattern formation since the spatial distribution of the molecules of different 
species has to be determined. In contrast, if one is interested in simulating only simple 
transport mechanisms, it is possible to avoid taking into account spatial coordinates 
explicitly and treat the effect of the transport as a time delay in species dynamics. In 
this way, species concentrations over a time interval are determined by DDEs (Delay 
Differential Equations). If we consider a fixed, discrete delay t, a DDE can be written as 


dc(t 
ae = c(t —1). (12.16) 
Therefore, in order to calculate the value of c at t = 0, one needs to know c at t = —T. 


The simplest method to solve a DDE is the step method. Let us consider, as an example, 
the case 


dc(t) _ 
dt 


2-c(t —3) (12.17) 


and suppose that c(t) = 5, t € [—3, 0]. Therefore, for the time interval [0, 3] (our first 
step) we have that 


dc(t) . 
dt 


510. (12.18) 
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whose solution is 
c(t) = 10f +c}. (12.19) 
In order to determine cj, we have to ask that 
c(0) =5=10-04+¢e, = c( =S. (12.20) 


Finally, we can write that c(t) = 10¢ +5, t € [0, 3]. For the next step, t € [3, 6], we have 


dc(t) 
dt =2-[10- (t — 3) +5] = 20r — 50 (12.21) 
whose solution is 
c(t) = 10¢7 — 50t + cp (12.22) 
and c2 is given by 
35 = c(3) = 90 — 150 + c2 =} co = 95. (12.23) 


Hence, c(t) = 10f7 — 50t + 95, t € [3,6]. Notice that c(t)? is discontinuous at 
t = 0; c (t) is discontinuous at t = 3 etc. This means that the discontinuity in c(t) 
propagates. One should take this fact into account in order to avoid problems with 
numerical integration. 


12.3. Pathway Re-engineering 
12.3.1 Modifying the Yeast Mating Pathway 


As we have seen above, Regot et al. [42] made use of the yeast mating pathway to carry 
out logic operations in vivo. In a concise representation of the yeast mating pathway (see 
Fig. 12.13), the receptor protein, which puts the cells in connection with the surrounding 
environment, gets activated upon @ factor binding. As a consequence, the receptor subunit 
Ste4 recruits the scaffold Ste5 to the membrane. In this way the PAK (p21-activated 
kinase!*) Ste20, which is localized on the membrane, can activate (i.e. phosphorylates) 
the MAPKKK"* Ste11 that is localized on the scaffold together with MAPKK Ste7 and 


126) = de(t)/dt; c’ (t) = d2c(t)/dt? ete. 
13.4 kinase is a protein that phosphorylates another protein. 


14M AP: mitogen-activated protein. 
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Fig. 12.13 Simplified scheme V Vw “factors (pheromones) 
: v"\"vV 
of the yeast mating pathway 
Receptor 
Cytoplasm 
Ailing recruitment 


Ste5 


scaffold 


Nucleus 


MAPK Fus3. Stell activates Ste7 and Ste7 activates Fus3 in a phosphorylation cascade. 
Afterwards, the signal is transmitted to the nucleus where the Fus/ promoter (pFus1) is 
activated. A scaffold protein has, therefore, the function of directing a flow of information 
from the receptor protein on the cell membrane to the nucleus. 

Bashor et al. [7] constructed various synthetic gene circuits based on the yeast mating 
pathway. These circuits have in common a chimeric protein that results from the fusion of 
a leucine zipper heterodimerization module to the C-terminus of Ste5 scaffold protein. We 
refer to it as “Ste5-zipper”. The other leucine zipper heterodimerization module was fused 
to a pathway modulator protein. A positive modulator is the protein Ste50 that facilitates 
Stell activation by Ste20. In contrast, the protein Msg5 represents a negative modulator 
since it inactivates a phosphorylated Fus3.!> In order to engineer a feedback loop, one 
modulator should be produced by a promoter induced by the yeast mating pathway (pFIG1 
— see Fig. 12.14). 

A positive feedback loop increased the steady state fluorescence level and showed 
ultrasensitivity (Hill coefficient: n = 2.42). In constrast, a negative feedback caused a 
decrease in the fluorescence level at the equilibrium without any significant cooperativity 
effects (n = 1.15) if compared to the wild type pathway (n = 1.12). 

More complex behaviors were obtained by engineering different kinds of competition 
within the synthetic pathway. A pulse generator, for instance, was realized by consti- 


'5We term Ste50-zipper and Msg5-zipper the chimeric proteins where a modulator is fused to a 
leucine zipper heterodimerization module. 
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Fig. 12.14 Positive (a) and negative (b) feedback loop based on the scaffold Ste5 fused to a leucine 
zipper heterodimerization module 


tutively expressing a decoy leucine zipper that competed with the Ste5-zipper for the 
Msg5-zipper negative modulator. Msg5-zipper expression was induced by the pathway. 
The decoy had higher affinity than the scaffold to the Msg5-zipper. Hence, initially the 
decoy acted as a sink and the scaffold was poorly repressed. When the decoy got saturated, 
Msg5-zipper could bind the scaffold and repress the pathway. The pulse-generator was the 
result of a delayed negative feedback loop. Pulse sharpness was varied by modifying the 
decoy expression rate (see Fig. 12.15). 

A delay circuit arose, in contrast, by constitutively expressing Msg5-zipper and placing 
the synthesis of the high-affinity decoy under the control of the mating pathway. Initially, 
the response of the pathway was weak and only after a delay of 50 min the decoy reached a 
an amount sufficient to displace a significant quantity of the Msg5-zipper from the scaffold. 
As a consequence the circuit output (fluorescence) increased (see Fig. 12.16). 

Competition between the two modulators, Ste50-zipper and Msg5-zipper, led to the 
construction of an ultrasensitive switch when the negative modulator was constitutively 
expressed, whereas the synthesis of the positive modulator was under the control of the 
mating pathway. The circuit Hill coefficient (7 = 2.84) was slightly higher than that 
measured on the simple positive feedback loop (see above). Enhanced cooperativity was 
probably due to the conjunct effects of scaffold activation and Msg5-zipper displacement 
(by Ste50-zipper). 
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Fig. 12.15 Pulse generator circuit: scheme and dynamics [7] 


Finally, constitutive expression of the Ste50-zipper and pathway-controlled production 
of the Msg5-zipper!® provoked an acceleration in the pathway response time. It should 
be noted, however, that the synthetic and the wild-type pathway did not share the same 
fluorescence level at the equilibrium. 


12.3.2 Light-Sensitive Circuits 


Signalling pathways can be re-engineered by modifying the receptor proteins too. 
Levskaya et al. [30] built a chimeric light receptor in E. coli by fusing a cyanobacterial 
photoreceptor to the F. coli intracellular histidine kinase domain, EnvZ. Thus, the E. coli 
two-component system EnvZ-OmpR, which normally responds to osmotic shocks, was 
rewired to respond to light. Moreover, the ompC promoter (pOmpC) regulated by this 


16 this circuit, the Ste50 and Msg5 were not fused to the same leucine zipper heterodimerization 
module. The module bound to Msg5 conferred to the negative modulator a higher affinity to the 
synthetic Ste5-zipper. 
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Fig. 12.16 Delay circuit. Msg5-zipper is expressed under the constitutive promoter pSTES. Faded 
lines and shapes symbolizes lower repression of the Ste5 scaffold [7] 


system was fused to a LacZ sequence that catalyzed the production of a black precipitate 
in the presence of the chemical S-gal. 

In the absence of light, EnvZ autophosphorylates. This signal is transmitted through 
the pathway to the ompC promoter that is activated and leads the production of a black 
compound. When E. coli cells are exposed to light, EnvZ autophosphorylation ceases, 
pOmpC is no longer activated and, after some time, the cells lose the dark color. 

Levskaya et al. [30] grew a lawn of re-engineered E. coli cells on an agar plate. Then, 
they applied on top of the plate a mask carrying a specific pattern (the word “nature’”’) and 
exposed the cells to light. In this way, it was possible to re-create, by contrast, on the plate 
the same pattern present on the mask (see Fig. 12.17). 


Take Home Message 

¢ Gene circuit motifs are made of few interacting transcription units arranged in particular 
configurations. They carry out precise functions. 

¢ A positive feedback loop slows down the response time of a circuit and permits to 
engineer bistability. 

¢ A negative feedback loop accelerates the response time of a circuit and stabilizes protein 
concentration. 
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Fig. 12.17 Synthetic pathway that makes E. coli cells sensitive to light 


¢ Oscillators require a closed-loop gene structure. The overall sign of their interactions 
should be a minus. 

¢ Boolen gates (logic operations) can be achieved via promoter engineering (transcrip- 
tion regulation) or mRNA structural modifications with ribozymes and riboswitches 
(translation regulation). 

e Dimer systems provide an alternative way of designing AND Boolean gates. 

¢ Genetic multi-step cascades serve to create a delay in protein expression. Moreover, 
they show ultrasensitivity (switch-like behavior). 

¢ Feed forward loops are motifs made of three genes. I1 FFL works as a pulse generator. 
Moreover, it speeds up the response time of a circuit. 

¢ Motifs such as Boolean gates and pulse generators have been engineered in cell 
consortia as well. Cells in a consortium are split into groups that perform different 
tasks (e.g. sensing the input chemicals, producing the circuit output). 

¢ Signaling pathways can be rewired to detect signals and express proteins different from 
the “original” ones. Applications have been made both in bacteria and yeast. 
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